Abstract —This paper considers a heterogeneous ad hoc network
with multiple transmitter-receiver pairs, in which all
transmitters are capable of harvesting renewable energy from 
the environment and compete for one shared channel by random 
access. In particular, we focus on two different scenarios: the 
constant energy harvesting (EH) rate model where the EH rate 
remains constant within the time of interest and the i.i.d. EH 
rate model where the EH rates are independent and identically 
distributed across different contention slots. To quantify the roles 
of both the energy state information (ESI) and the channel 
state information (CSI), a distributed opportunistic scheduling 
(DOS) framework with two-stage probing and save-then-transmit 
energy utilization is proposed. Then, the optimal throughput and 
the optimal scheduling strategy are obtained via one-dimension 
search, i.e., an iterative algorithm consisting of the following two 
steps in each iteration: First, assuming that the stored energy level 
at each transmitter is stationary with a given distribution, the 
expected throughput maximization problem is formulated as an 
optimal stopping problem, whose solution is proven to exist and 
then derived for both models; second, for a fixed stopping rule, 
the energy level at each transmitter is shown to be stationary and 
an efficient iterative algorithm is proposed to compute its steady- 
state distribution. Finally, we validate our analysis by numerical 
results and quantify the throughput gain compared with the best- 
effort delivery scheme. 

Index Terms —Distributed opportunistic scheduling, energy 
harvesting, optimal stopping. 


I. Introduction 

Conventional wireless communication devices are usually 
powered by batteries that can provide stable energy supplies. 
However, the battery lifetime limits the operation time of 
such devices. Recently, energy harvesting (EH) techniques 
have been proposed as a promising alternative to the conventional
constant power supplies 0, 0; such techniques convert
renewable energy from the environment
into electrical energy. In this way, the node lifetime can
be prolonged significantly. Compared with the conventional 
constant energy suppliers, transmitters powered by energy 
harvesters are restricted by a new class of EH constraints, 
i.e., the consumed energy up to any time is bounded by the 
harvested energy until this point 0. Therefore, to meet certain 
performance requirements, such as throughput, stability, delay, 
etc., these EH constraints should be carefully taken into 
account in the design of EH-based communication systems. 

A. Related Works and Motivations 

Communication systems powered by energy harvesters have 
been investigated in recent years. For point-to-point wireless
systems, the authors in 0 0 considered the throughput
maximization problem over a finite horizon for both the 
cases that the harvested energy information is non-causally 
and causally known to the transmitter, where the optimal 
solutions were obtained by the proposed one-dimension search 
algorithm and dynamic programming (DP) techniques, respectively.
In 0, the authors extended the results to the classic
three-node Gaussian relay channel with EH source and relay 
nodes, where the optimal power allocation algorithms were 
proposed. With a more practical circuit model by considering 
the half-duplex constraint of the battery, the authors in 0 
proposed a save-then-transmit protocol, which divides each 
transmission frame into two parts: the first one for harvesting 
energy and the other for data transmission. For wireless 
networks with EH constraints, the authors in 0 investigated 
the performance of some standard medium access control 
protocols, e.g., TDMA, framed-Aloha, and dynamic-framed- 
Aloha. 

In related works on ad hoc networking, opportunistic 
scheduling has been known as an effective method to utilize 
the wireless resource 0-El. In particular, a distributed 
opportunistic scheduling (DOS) scheme was introduced in 
El, El, where only local channel state information (CSI) 
is available to each transmitter. By applying optimal stopping 
theory El, it has been shown in El, [13] that the optimal
solution for the expected throughput maximization problem
has a threshold-based structure. When channel estimation is
imperfect, the authors in [15] proposed a two-level channel
probing framework that allows the accessing transmitter to
perform one more round of channel estimation before data
transmission to improve the quality of estimated CSI and
possibly increase the system throughput. The optimal scheduling
policy of the two-level probing framework was proven to be 
threshold-based as well by referring to the optimal stopping 
with two-level incomplete information Go- 

Different from the traditional energy supplies (e.g., non-rechargeable
batteries, power grid) in the conventional networks
g|-ED, G), we consider the network powered by
energy harvesters that could generate electric energy from
different renewable energy sources. Among various types of
renewable energy sources, we consider two typical energy
harvesting rate models in this paper¹:

1) Constant energy harvesting rate model : The EH rate 
(specifically, the amount of harvested energy per unit 
time) can be approximated as a constant within the 
entire time duration of interest. For example, the power 
variation coherence time of wind and solar EH systems 
is on the order of multiple seconds ED, ED, while the 
duration of one communication block is about several 
milliseconds. Thus, over thousands of communication 
blocks, the EH rate keeps almost the same. 

2) Independent and identically distributed (i.i.d.) energy 
harvesting rate model : Compared to the constant rate 
model, the EH rate for this case changes much faster, i.e., 
comparable to the duration of one communication block. 
For example, the energy from light, thermal, kinetic, or 
ambient-radiation sources, usually changes every several 
milliseconds. Accordingly, EH rates can be modeled as 
an i.i.d. 0, ® random process. 

With the above two EH models, we investigate the DOS 
problem for a heterogeneous EH-based network, where the 
channel gains across different links and the EH rates across 
different transmitters are non-identical. The system works in a 
two-stage pattern as follows. In the first stage, all transmitters 
adopt random access and do channel probing (CP), during 
which the successful link can obtain the CSI via channel 
contentions, similar to those in CD, G), G). In the second 
stage, the successful transmitter at the first stage has the option 
to spend certain time to harvest more energy, i.e., executes 
energy probing (EP); and then, with the updated energy state 
information (ESI), it decides either to transmit in the rest of the 
transmission block, or to stop probing and give up the channel. 
With EP, since the total duration of the transmission block is 
fixed, although spending more time on harvesting energy could 
increase the energy level, it decreases the portion of the time 
for data transmission, which leads to a tradeoff to optimize. 

B. Summary of Contributions 

We propose a DOS framework for an ad hoc network 
powered by energy harvesters, which efficiently utilizes both 
the CSI and the ESI at each transmitter. In this framework, 
we adopt a “save-then-transmit” scheme, i.e., the transmitter 

¹A more general case is that the transmitter only has causal information
about EH rates, which could be modeled as a Markov process. This model 
has been used in the point-to-point wireless system a, 0 . 


keeps harvesting energy before it initiates the transmission that 
uses up all the available energy in the battery. Note that such 
a greedy power utilization scheme is suboptimal in general, 
while it is sensible when the number of transmitters is large. 

The main contributions of this paper are summarized as 
follows: 

1) First, by assuming that the battery state at each transmitter 
is stationary with a certain distribution, the throughput 
maximization problem for the considered network is cast 
as a rate-of-return problem. We prove the existence of the 
optimal stopping rules for both EP and CP, and further 
obtain: 

• For the constant EH model, the optimal stopping rule 
of EP is determined by maximizing the throughput 
over the transmission block before starting EP, and 
it is either zero or a finite value according to the 
given CSI and ESI. Then, based on the stopping rule 
of EP, the optimal stopping rule of CP is shown to 
be a pure threshold policy (the threshold does not 
change over time) and the transmission decision is 
made right after each round of CP. 

• For the i.i.d. EH model, the optimal stopping rule 
for EP is shown to be dynamic and threshold based, 
which is obtained by solving a stopping problem over 
a finite-time horizon. The stopping rule of CP is also 
threshold based and obtained based on the decision of 
EP, i.e., either transmit or start a new CP. Unlike the 
constant case, the transmission decision under i.i.d. 
EH model is made during the process of EP. 

2) Next, with a fixed stopping rule, we show the existence 
of the steady-state distribution of the battery state by 
constructing a “super” Markov chain with its states being 
jointly determined by all transmitters. Moreover, we 
propose an efficient iterative algorithm to compute the 
steady-state distribution, executed at each transmitter in 
parallel. In particular, it is shown that with the constant EH
model, if the network consists of $n$ transmitters and each
has $m$ possible energy states, the computational
complexity for one iteration of the proposed algorithm is
on the order of $O(n^2 m^2)$, which is more efficient (when
$n$ and $m$ are large) than that of the super Markov chain
case, whose complexity for one iteration is on the order
of $O(2m^{2n})$.

3) Finally, by exploiting the structure of the rate-of-return 
problem, we show that the maximum throughput and the 
optimal scheduling strategy of the DOS framework could 
be obtained for both the two EH rate models, via one- 
dimension search by repeating the above two steps. 

The rest of this paper is organized as follows. Section II
introduces the system model. In Section III, the throughput
maximization problem is formulated and solved under the
assumption that the stationary distribution of the battery at
each transmitter is known. Then, with the obtained stopping
rule, we prove in Section IV the existence of the steady-state
distribution for each transmitter, and propose an iterative
algorithm to compute it. Section V discusses the computation
for the optimal throughput. In Section VI, numerical results are
provided to validate our analysis and evaluate the throughput
gain of our proposed scheduling scheme against the best-effort
delivery. Finally, Section VII concludes the paper.

Fig. 1. One realization for the DOS with two-stage probing.

II. System Model 

We consider a heterogeneous single-hop ad hoc network, 
where all the $I$ transmitter-receiver pairs have independent but
not necessarily identical statistical information of CSI and ESI. 
All pairs contend for one shared channel by random access. 
For each link, the transmitter is powered by a renewable energy 
source and utilizes a small rechargeable battery to temporarily
store the harvested energy. Note that the transmitter could 
keep harvesting energy until it initiates a data transmission. 
In addition, we do not consider the effect of inefficiency in 
energy storage and retrieval, nor the energy consumed other 
than data transmission, which can be approximately neglected 
by properly adjusting the energy model ffl-ia, no. Denote 
the duration of one channel contention as l > 0, and the length 
of one transmission block as L, which is an integer multiple 
of l. 

As illustrated in Fig. 1, the DOS procedure of the whole
network takes place in two stages: First, each transmitter
probes the channel via random access and harvests energy at
the same time; and then the successful transmitter may start the
EP (to potentially increase the average transmission rate over
the transmission block²) before the data transmission process.

1) Channel probing: In the first stage, a successful channel
contention is defined as follows: All transmitters first
independently contend for the channel until there is only one
contending in a particular time slot. Furthermore, one round of
CP is defined as the process to achieve one successful channel
contention. Denote the probability that transmitter $i$ contends
for the channel as $q_i$, $1 \le i \le I$, with $0 \le q_i \le 1$. As such,
the probability that the $i$-th transmitter successfully occupies
the channel is given by $Q_i = q_i \prod_{j \ne i} (1 - q_j)$. Then, the
probability to achieve one successful channel contention at
each time slot is given by $Q = \sum_{i=1}^{I} Q_i$, and it is easy to
check that $Q \le 1$ ED. Accordingly, for the $n$-th round of
CP, $n \ge 1$, we use $K_n$ to denote the number of time slots
needed to achieve a successful channel contention, which is a
random variable that follows the geometric distribution with
parameter $Q$ llT2l, ifOl, fl5l. In this way, the expected duration
of one round of CP is given as $l/Q$. Denote the transmitted
signal at transmitter $i$ as $x^i$; the received signal $y^i$ is
thus given by $y^i = h^i x^i + z^i$, where $h^i$ is the complex
channel gain and $z^i$ is the circularly symmetric complex
Gaussian (CSCG) noise with zero mean and variance $\sigma^2$ at
the receiver. Across different links, $\{h^i\}_{1 \le i \le I}$ are independent
with finite mean and variance, while not necessarily identically
distributed. After one round of CP, the successful transmitter
can perfectly estimate the corresponding channel gain via
certain feedback mechanisms, and thus $h^i$ is assumed a known
constant during the whole transmission block. After CP, the
successful transmitter chooses one of the following actions
based on its local CSI and ESI:

(a) releases the channel (if the CSI and ESI indicate that the 
transmission rate is lower than a threshold) and let all links 
re-contend; or 

(b) directly transmits until the end of the transmission block; 
or 

(c) holds the channel, starts EP. 

Note that to complete one data transmission, it may take 
$n$ rounds of CPs as depicted in Fig. 1. It is worth noting
that each transmitter keeps harvesting energy until it starts a 
transmission, and after each round of CP, only the successful 
transmitter makes a choice among three actions as listed above. 
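
To illustrate the contention statistics of the CP stage, the following minimal Python sketch computes the per-transmitter success probabilities $Q_i = q_i \prod_{j \ne i}(1 - q_j)$, the overall success probability $Q = \sum_i Q_i$, and the expected duration $l/Q$ of one round of CP. The function name `contention_stats` and the example values are illustrative assumptions and are not part of the proposed scheme.

```python
from math import prod


def contention_stats(q, slot):
    """Return (list of Q_i, Q, expected CP duration slot / Q).

    q    -- contention probabilities q_i, one per transmitter (hypothetical input)
    slot -- duration l of one contention slot
    """
    n = len(q)
    # Transmitter i succeeds when it contends and every other transmitter stays silent.
    q_success = [q[i] * prod(1 - q[j] for j in range(n) if j != i) for i in range(n)]
    q_total = sum(q_success)
    return q_success, q_total, slot / q_total


# Two transmitters, each contending with probability 0.5 (illustrative values).
per_link, total, mean_cp = contention_stats([0.5, 0.5], slot=1.0)
```

With two symmetric transmitters, each succeeds in a given slot with probability 0.25, so a round of CP lasts two slots on average.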

2) Energy Probing: When the successful transmitter decides
not to take action (a) or (b) defined above, it starts the
second stage EP, i.e., action (c), to obtain more energy. During
this stage, the transmitter chooses to continue harvesting
energy slot by slot, and then ends EP by action (a) or (b),
i.e., either releasing the channel or transmitting over the rest
of the transmission block. As depicted in Fig. 1, one
transmission is fulfilled with $n$ rounds of CPs and $m_n$ extra
slots of EP.

For transmitter $i$, let $B^i_{n,m} \in \mathcal{A}$ denote the energy level of
the battery after the $n$-th round of CP and $m$ additional time
slots for EP, where $\mathcal{A} = \{0, \delta, 2\delta, \cdots, B_{\max}\delta\}$ is the set of all
possible energy states, with $\delta$ being the minimum energy unit
and $B_{\max}\delta$ the capacity of the battery. We use $E^i_t$ to denote
the EH rate of transmitter $i$ at time $t$. As noted in the previous
section, we consider the following two types of scenarios:

1) Constant EH rate model: $\{E^i_t\}$ are constants for each
$i$, i.e., $E^i_t = E^i \in \mathcal{A}$ for all $t \ge 1$, and $\{E^i\}$ can
thus be learned and assumed non-causally known before
transmissions.

2) I.i.d. EH rate model: The EH rates among different
transmitters are independent. For transmitter $i$, $\{E^i_t\}_{t \ge 1}$
are i.i.d. across $t$, with finite mean and the probability
mass function (PMF) $\Pr\{E^i_t = e\delta\} = F^i(e)$, where
$e \in \{0, 1, 2, \cdots\}$.

²If the successful transmitter experiences a bad channel condition and a
low energy level, it may skip the transmission.

Under the save-then-transmit scheme, the energy level
is non-decreasing until a transmission and drops to zero afterwards,
which forms a Markov chain (as described in Section IV later).
Thus, the energy level $B^i_{n,m}$ can be written as

$$B^i_{n,m} = \min\Big\{ B^i_{n,0} + l \sum_{k=0}^{m} E^i_k,\; B_{\max}\delta \Big\}, \tag{1}$$

where $n \ge 1$, $0 \le m \le L/l$, and $\min\{x, y\}$ denotes the
smaller value between two real numbers $x$ and $y$. Note that
$B^i_{n,0}$ indicates the energy level after the successful contention
round before taking any action. If $m = 0$, i.e., transmitter $i$
does not do EP, we let $\sum_{k=0}^{0} E^i_k = E^i_0 = 0$.
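
The battery update can be sketched in a few lines of Python, assuming that (1) takes the form $B_{n,m} = \min\{B_{n,0} + l \sum_k E_k,\ B_{\max}\delta\}$ with the sum running over the $m$ EP slots. The helper name `battery_level` and the numerical values are hypothetical and serve only as an illustration.

```python
def battery_level(b_start, harvested_rates, slot, b_max, delta):
    """Energy level after energy probing, clipped at the battery capacity.

    b_start         -- energy level right after the successful contention
    harvested_rates -- EH rates of the EP slots (empty list when m = 0)
    slot            -- slot duration l
    b_max, delta    -- capacity in units and size of one energy unit
    """
    capacity = b_max * delta
    return min(b_start + slot * sum(harvested_rates), capacity)


# No EP (m = 0): the level is unchanged. Three EP slots: the level hits the capacity.
level_no_ep = battery_level(2.0, [], slot=1.0, b_max=4, delta=1.0)
level_capped = battery_level(2.0, [1.0, 1.0, 1.0], slot=1.0, b_max=4, delta=1.0)
```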

III. Transmission Scheduling 

In this section, we aim to derive the optimal scheduling
policy that maximizes the average throughput for the considered
network with the proposed two-stage access strategy,
conditioned on the given battery state distribution. We point
out that the results obtained in this section are based on the
assumption that the energy level at transmitter $i$ is stationary
with a given distribution $\Pi^i$, for $1 \le i \le I$, which will be
validated in Section IV.

A. Problem Formulation 

After the $n$-th round of CP and $m$ additional time slots,
the CSI and the ESI at the successful transmitter are given as
$\mathcal{F}^i_{n,m} = \{h^i_n, B^i_{n,m}\}$. Note that the channel gain $h^i_n$ is now
indexed by $n$, which is determined at the end of the $n$-th round
of CP and assumed fixed during the whole data transmission
block. In particular, $\mathcal{F}^i_{n,0} = \{h^i_n, B^i_{n,0}\}$ denotes the initial
information right after the $n$-th round of CP. For convenience,
we omit the index $i$ for either the CSI or the ESI in the sequel,
and retrieve it when necessary.

By adopting the save-then-transmit scheme at the transmitters
to fully take advantage of each channel use, the
transmission rate over $L/l$ time slots with state $\mathcal{F}_{n,m}$ is defined
as

$$R_n(m) = \Big(1 - \frac{ml}{L}\Big) \log\Big(1 + \frac{|h_n|^2 B_{n,m}}{\sigma^2 (L - ml)}\Big). \tag{2}$$

When $ml = L$, we set $R_n(m) = 0$ since there is no
transmission in this case.
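
As a minimal sketch of the save-then-transmit rate, the following Python function assumes that the whole stored energy $B_{n,m}$ is spent over the remaining $L - ml$ time units, so that the rate is $(1 - ml/L) \log(1 + |h_n|^2 B_{n,m} / (\sigma^2 (L - ml)))$. The natural logarithm (rate in nats) and the name `transmission_rate` are assumptions made for illustration.

```python
import math


def transmission_rate(m, h_abs2, battery, block, slot, sigma2):
    """Average rate over the block when transmitting after m EP slots.

    h_abs2  -- squared channel gain |h|^2
    battery -- stored energy available at the start of the transmission
    block   -- block length L; slot -- slot length l; sigma2 -- noise variance
    """
    if m * slot >= block:
        # No time is left for data transmission.
        return 0.0
    time_fraction = 1.0 - m * slot / block
    snr = h_abs2 * battery / (sigma2 * (block - m * slot))
    return time_fraction * math.log(1.0 + snr)


rate_now = transmission_rate(0, 1.0, 10.0, block=10.0, slot=1.0, sigma2=1.0)
rate_none = transmission_rate(10, 1.0, 10.0, block=10.0, slot=1.0, sigma2=1.0)
```

The second call illustrates the boundary case $ml = L$, in which the rate is set to zero.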

Remark 3.1: Some important properties of R n (m ) are 
listed as follows. 

• $E[R_n(m)] < \infty$ and $E[(R_n(m))^2] < \infty$, which results
from the fact that $h_n$ has finite mean and variance and
the energy level $B_{n,m}$ is also finite.

• $\{R_n(m)\}_{n \ge 1}$ are approximately independent random
variables over $n$. To see this, recall that the channel gains
and the battery states are independent across different
transmitters at a given time slot; moreover, the probability
is small for a transmitter to occupy the channel in two
consecutive contentions when the number of user pairs
is large. For example, in an ad hoc network with $K$
pairs where each pair fairly competes for the channel
use with probability $1/K$, such a probability is
$\frac{1}{K^2}(1 - 1/K)^{2(K-1)}$ [19], which is as small as 0.0625 even when
$K = 2$. Thus, $\{\mathcal{F}_{n,m}\}_{n \ge 1}$ are nearly independent over
$n$, which implies that $\{R_n(m)\}_{n \ge 1}$ are approximately independent over
$n$.
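
The numerical value quoted in the example above can be checked with a short Python sketch, assuming that a given pair wins one contention with probability $(1/K)(1 - 1/K)^{K-1}$ and that the two contentions are independent. The function name `repeat_win_probability` is hypothetical.

```python
def repeat_win_probability(k_pairs):
    """Probability that one given pair wins two independent contentions in a row."""
    # A single win: the pair contends (1/K) while the other K - 1 pairs stay silent.
    single_win = (1.0 / k_pairs) * (1.0 - 1.0 / k_pairs) ** (k_pairs - 1)
    return single_win ** 2


p_two_pairs = repeat_win_probability(2)
p_ten_pairs = repeat_win_probability(10)
```

The probability decreases further as the number of pairs grows, which supports the near-independence argument for large networks.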

Let $N$ be the stopping rule for CP, and $M_n$ be the stopping
rule for EP associated with the $n$-th CP for $1 \le n \le N$, which
together tell the transmitter when to start the data transmission.
Then, under these stopping rules, the transmission rate would
be $R_N(M_N)$, and we let $T_N$ be the total time duration
for completing one data transmission. Here, $T_N$ contains the
duration of $N - 1$ rounds of CP, which is given by $l \sum_{n=1}^{N-1} K_n$,
and $l \sum_{n=1}^{N-1} M_n$, i.e., the time slots in which the transmitter probes the
energy but gives up the channel after EP. Also, after the $N$-th
round of CP with the time $K_N l$, the transmitter may use $M_N$
slots for the EP and transmit within the duration $L - M_N l$
afterwards. Accordingly, we obtain

$$T_N = l \sum_{n=1}^{N-1} M_n + l \sum_{n=1}^{N} K_n + L. \tag{3}$$

If such a process is executed $J$ times with $R_{N_j}(M_{N_j}) L$ bits
transmitted at each transmission, $1 \le j \le J$, we obtain the
average throughput $\lambda$ per transmission of the network:

$$\frac{L \sum_{j=1}^{J} R_{N_j}(M_{N_j})}{\sum_{j=1}^{J} T_{N_j}} \to \frac{L\, E[R_N(M_N)]}{E[T_N]} \quad \text{a.s.}$$

as $J \to \infty$ by the renewal theory [20]. Again, we point out
that the energy level is stationary at the $N_j$-th round of CP
for $j \ge 1$, as we assumed.

Our target is to maximize $\lambda$ by adjusting the stopping rules
$N$ and $\{M_n\}_{1 \le n \le N}$. It is easy to see that maximizing $\lambda$ is in
fact a “rate-of-return” stopping problem [14], ED (for which
the specific definition is given later). Instead of directly solving
this problem, we examine the “net reward” of the considered
network, which is given as

$$r_N(\lambda) = R_N(M_N) L - \lambda T_N = (R_N(M_N) - \lambda) L - \lambda l \Big[ K_N + \sum_{n=1}^{N-1} (K_n + M_n) \Big], \tag{4}$$

for some $\lambda > 0$. The term $(R_N(M_N) - \lambda)L$ can be interpreted
as the reward of transmission, $\lambda l K_n$ as the cost of CP, and
$\lambda l M_n$ as the cost of failed EP for $1 \le n \le N - 1$. We set
$r_\infty(\lambda) = -\infty$ since it is irrational that the system does not
send any data forever. Then, we define the maximum value of
the expected net reward with $\lambda > 0$ as

$$S^*(\lambda) = \sup_{N \in \mathcal{N},\, \{M_n\}_{1 \le n \le N}} E[r_N(\lambda)], \tag{5}$$

where $\sup(\cdot)$ denotes the least upper bound for a set of real
numbers, and

$$\mathcal{N} = \{ N : N \ge 1,\ E[T_N] < \infty,\ \text{for } M_n \in [0, L/l] \text{ with } 1 \le n \le N \}. \tag{6}$$


Remark 3.2: One important property of problem (5) is time
invariance. We observe that before the system starts the $N$-th
round of CP, the accumulated cost $\lambda l \sum_{n=1}^{N-1} (K_n + M_n)$ over
the past $N - 1$ rounds of CP has already been finalized, with
no need to be further considered in the remaining decision
process. Moreover, $\{R_n(M_n)\}_{1 \le n \le N}$ are independent over $n$
as we mentioned before; it follows that the expected optimal
reward before the $N$-th round of CP is the same as that of any
previous round of CP. In other words, the system can obtain
the expected optimal reward $S^*(\lambda)$ whenever a new round of
CP is about to start. Therefore, we conclude that problem (5)
is time invariant.

Recall from Section II that after each round of CP, the
successful transmitter will choose one of three actions (i.e.,
transmitting, giving up the channel, or starting EP) according
to the stopping rule of CP, which needs the expected reward
of EP depending on the stopping rule of EP. Thus, we will
first introduce the formulation and the optimal stopping rule
for EP, and then for CP.

1) Formulation for EP: When the successful transmitter
starts EP after the $n$-th round of CP, where $1 \le n \le N$, it will
end up with one of the two actions: transmitting or giving up
the channel without transmission. Specifically, we define the
expected optimal reward at the $k$-th slot of EP, $0 \le k \le L/l$,
as

$$U_k(\mathcal{F}_{n,k}) = \max_{k \le M_n \le L/l} E\big[ \max\{ (R_n(M_n) - \lambda) L,\ -\lambda l M_n + S^*(\lambda) \} \,\big|\, \mathcal{F}_{n,k} \big], \tag{7}$$

where $-\lambda l M_n + S^*(\lambda)$ is the expected value of giving up the
channel after $M_n$ slots of EP. If $k = 0$, $U_0(\mathcal{F}_{n,0})$ denotes the
maximum of the expected net reward right after the $n$-th round
of CP. In other words, we want to find the optimal stopping
rule $M^*_n$ of EP which attains

$$U_0(\mathcal{F}_{n,0}) = \max_{0 \le M_n \le L/l} E\big[ \max\{ (R_n(M_n) - \lambda) L,\ -\lambda l M_n + S^*(\lambda) \} \,\big|\, \mathcal{F}_{n,0} \big]. \tag{8}$$

Note that $M^*_n$ exists since problem (8) is an optimal stopping
problem over a finite time horizon [14], [22].

2) Formulation for CP: By choosing $\{M^*_n\}_{1 \le n \le N}$, we
define

$$\lambda^* = \sup_{N \in \mathcal{N}} \frac{L\, E[R_N(M^*_N)]}{E[T_N]}, \qquad N^* = \arg\sup_{N \in \mathcal{N}} \frac{L\, E[R_N(M^*_N)]}{E[T_N]}. \tag{9}$$

Note that if the optimal stopping rule $N^* \notin \mathcal{N}$, we would
claim that $N^*$ does not exist. Thus, $\lambda^*$ is the optimal average
throughput of the original rate-of-return problem.

The connection between the transformed problem (5) and
the original problem (9) is introduced in the following lemma.
It is worth noticing that with the optimal stopping rule
$\{M^*_n\}_{1 \le n \le N}$ for EP, problem (5) boils down to a one-level
stopping problem with stopping rule $N$.

Lemma 3.1: (i) If there exists $\lambda^*$ such that $S^*(\lambda^*) = 0$,
this $\lambda^*$ is the optimal throughput defined in (9). Moreover,
if $S^*(\lambda^*) = 0$ is attained at $N^*(\lambda^*)$, the stopping rule $N^*$
defined in (9) is the same as $N^*(\lambda^*)$, i.e., $N^* = N^*(\lambda^*)$.

(ii) Conversely, if (9) is true, then $S^*(\lambda^*) = 0$, which is
attained at $N^*$ given by (9).

This lemma directly follows Theorem 1 in Chapter 6 of M- 
The next proposition secures the existence of the optimal 
stopping rule for CP. 

Proposition 3.1: With the EP stopping rule $\{M^*_n\}_{1 \le n \le N}$,
the optimal stopping rule $N^*(\lambda)$ for problem (5) exists.
Moreover, for $N \ge 1$, the following equation holds:

$$S^*(\lambda) = U_0(\mathcal{F}_{N,0}) - \lambda l K_N. \tag{10}$$


The proof is given in Appendix A. 

Remark 3.3: Equation (10) is obtained from the optimality
equation of the CP. The calculation of the optimal
throughput relies on this equation, which will be shown in
Section V.

Now, we are ready to derive the optimal stopping rules $N^*$
and $\{M^*_n\}$ that jointly maximize the expected value of $r_N(\lambda)$
for the two different EH models. As we mentioned above, the
stopping rule $N$ for CP relies on the form of $M_N$ (the stopping
rule for EP). We will find the optimal stopping rule $M^*_N$ before
$N^*$. After obtaining the forms of the optimal stopping rules,
the calculation for the optimal throughput will be discussed.


B. Optimal Stopping Rule for Constant EH Model 


For notational simplicity, we omit the index $N$ of CP when
we derive the stopping rule $M$ in this subsection. Then, we
will derive the stopping rule $N$ based on the results of EP.

When the EH rate is constant, the transmission rate $R(M)$
is deterministic for a given $\mathcal{F}_0$ over the transmission block.
Then, we obtain a simplified version of $U_0(\mathcal{F}_0)$ in (8) as

$$U_0(\mathcal{F}_0) = \max_{0 \le M \le L/l} \max\{ (R(M) - \lambda) L,\ -\lambda l M + S^*(\lambda) \}.$$

The value of $U_0(\mathcal{F}_0)$ can be obtained simply by comparing
$-\lambda l M + S^*(\lambda)$ and $(R(M) - \lambda)L$, whose values can be
computed individually. Clearly, the first one achieves its maximum
$S^*(\lambda)$ at $M = 0$. For the second term, only $R(M)$ is changing
over $M$ with a given $\mathcal{F}_0$. Therefore, we settle down to the
following auxiliary problem:

$$V^* = \arg\max_{0 \le V \le L/l} R(V). \tag{11}$$

Then, we could use the optimal $V^*$ to find $M^*$ without
difficulty. Note that when $Vl = L$, it follows that $R(V) = 0$
according to our definition in Section III, which implies that
$V = L/l$ cannot be optimal, and thus we take $0 \le V \le L/l - 1$.
We first consider a related continuous version of
$R(V)$ by relaxing $Vl/L$ as $\rho$, $0 \le \rho < 1$:

$$\max_{0 \le \rho < 1} R(\rho) = \max_{0 \le \rho < 1}\ (1 - \rho) \log\Big(1 + \frac{|h|^2 \min\{B_0 + \rho L E,\ B_{\max}\delta\}}{(1 - \rho) L \sigma^2}\Big). \tag{12}$$

After solving (12), we will show how to obtain the optimal
solution of problem (11).

First, we establish some properties for the objective function
of problem (12).

Proposition 3.2: For arbitrary $a, b > 0$, we have that

1) the function $y(x) = (1 - x) \log\big(1 + \frac{a + bx}{1 - x}\big)$ is concave
over $[0, 1)$, and $\lim_{x \to 1^-} y'(x) < 0$;

2) the function $g(x) = (1 - x) \log\big(1 + \frac{a}{1 - x}\big)$ is concave
and non-increasing over $[0, 1)$.

Proof: Please see Appendix B. ■

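Since $\rho \in [0, 1)$, when $\frac{B_{\max}\delta - B_0}{LE} \ge 1$, $R(\rho)$ is simply
concave over $\rho$ on $[0, 1)$ according to part 1) of Proposition
3.2. When $\frac{B_{\max}\delta - B_0}{LE} < 1$, according to Proposition 3.2,
$R(\rho)$ is concave over $\big[0, \frac{B_{\max}\delta - B_0}{LE}\big]$, and is non-increasing
on $\big[\frac{B_{\max}\delta - B_0}{LE}, 1\big)$. Thus, $R(\rho)$ cannot achieve its maximum
in $\big(\frac{B_{\max}\delta - B_0}{LE}, 1\big)$. Therefore, we treat this fact as a new
constraint over $\rho$, and rewrite problem (12) as

$$\max_{\rho} G(\rho) = \max_{\rho}\ (1 - \rho) \log\Big(1 + |h|^2 \frac{B_0 + \rho L E}{(1 - \rho) L \sigma^2}\Big) \quad \text{s.t. } B_0 + \rho L E \le B_{\max}\delta,\ 0 \le \rho < 1. \tag{13}$$

Next, we establish the following proposition to solve problem
(13), where the obtained solution is optimal for problem
(12) as well.

Proposition 3.3: The optimal solution $\rho^*$ for problem (13)
is given by:

$$\rho^* = \begin{cases} \min\Big\{\rho_0, \dfrac{B_{\max}\delta - B_0}{LE}\Big\}, & \text{when } \dfrac{C + D}{1 + C} > \log(1 + C); \\ 0, & \text{otherwise,} \end{cases}$$

where $C = \frac{|h|^2 B_0}{L \sigma^2}$, $D = \frac{|h|^2 E}{\sigma^2}$, and $\rho_0$ is the unique
solution for the equation $\log\big(1 + \frac{C + D\rho}{1 - \rho}\big) = \frac{C + D}{1 - \rho + C + D\rho}$
when $\frac{C + D}{1 + C} > \log(1 + C)$.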

Proof: Please see Appendix C. ■ 
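
The relaxed solution $\rho^*$ can be found numerically. The sketch below assumes the objective $G(\rho) = (1 - \rho)\log(1 + (C + D\rho)/(1 - \rho))$, whose derivative is $G'(\rho) = \frac{C + D}{1 - \rho + C + D\rho} - \log\big(1 + \frac{C + D\rho}{1 - \rho}\big)$, and uses bisection on $G'$, which is valid because $G$ is concave. The function name `optimal_rho`, the argument `rho_cap` (the battery-capacity bound on $\rho$), and the tolerance are illustrative assumptions; the paper does not prescribe a particular root-finding method.

```python
import math


def optimal_rho(c, d, rho_cap, tol=1e-10):
    """Maximize (1 - rho) * log(1 + (c + d * rho) / (1 - rho)) over [0, rho_cap]."""

    def slope(rho):
        # Derivative of the objective with respect to rho.
        return (c + d) / (1.0 - rho + c + d * rho) - math.log(1.0 + (c + d * rho) / (1.0 - rho))

    if slope(0.0) <= 0.0:
        # The objective already decreases at rho = 0, so no EP is worthwhile.
        return 0.0
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return min(0.5 * (lo + hi), rho_cap)


rho_free = optimal_rho(0.0, 1.0, rho_cap=1.0)    # stationary point at 1 - 1/e
rho_capped = optimal_rho(0.0, 1.0, rho_cap=0.3)  # the capacity bound is active
rho_zero = optimal_rho(10.0, 0.0, rho_cap=1.0)   # nothing to harvest, so rho* = 0
```

For $C = 0$ and $D = 1$ the stationarity condition reduces to $1 + \log(1 - \rho) = 0$, which gives $\rho_0 = 1 - 1/e$ and provides a simple check of the routine.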

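Based on the optimal solution $\rho^*$, the optimal $V^*$ for $R(V)$
in (11) can be obtained easily: We only need to compare
$R(\lfloor \rho^* L/l \rfloor)$ against $R(\lceil \rho^* L/l \rceil)$, and $V^*$ should attain the
larger value. Specifically, we have the following result.

Proposition 3.4: The optimal $V^*$ of the problem (11) is
given by

$$V^* = \begin{cases} \lfloor \rho^* L/l \rfloor, & \text{if } R(\lfloor \rho^* L/l \rfloor) \ge R(\lceil \rho^* L/l \rceil); \\ \lceil \rho^* L/l \rceil, & \text{if } R(\lceil \rho^* L/l \rceil) > R(\lfloor \rho^* L/l \rfloor); \\ 0, & \text{otherwise,} \end{cases} \tag{14}$$

where $\rho^*$ is obtained by Proposition 3.3. Thus, the optimal
stopping rule $M^*$ is given by

$$M^* = \begin{cases} 0, & \text{if } (R(V^*) - \lambda) L < S^*(\lambda); \\ V^*, & \text{otherwise.} \end{cases} \tag{15}$$

The optimal reward $U_0(\mathcal{F}_0)$ with constant EH rate model is

$$U_0(\mathcal{F}_0) = \max\{ (R(V^*) - \lambda) L,\ S^*(\lambda) \}. \tag{16}$$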
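
The rounding step and the resulting EP decision under the constant EH model can be sketched as follows. The code assumes that the rate after $V$ EP slots is $(1 - Vl/L)\log(1 + |h|^2 \min\{B_0 + VlE, B_{\max}\delta\} / (\sigma^2 (L - Vl)))$ with the natural logarithm, that ties between the floor and the ceiling are resolved in favor of the floor, and that $\lambda$ and $S^*(\lambda)$ are supplied by the caller. All function and parameter names are hypothetical.

```python
import math


def rate_after_ep(v, h_abs2, b0, eh_rate, block, slot, sigma2, b_cap):
    """Rate over the block when transmitting after v EP slots (constant EH rate)."""
    if v * slot >= block:
        return 0.0
    battery = min(b0 + v * slot * eh_rate, b_cap)
    snr = h_abs2 * battery / (sigma2 * (block - v * slot))
    return (1.0 - v * slot / block) * math.log(1.0 + snr)


def ep_decision(rho_star, lam, s_star, **link):
    """Return (V*, M*): the best integer EP length and the resulting stopping rule."""
    slots = link["block"] / link["slot"]
    v_floor = math.floor(rho_star * slots)
    v_ceil = math.ceil(rho_star * slots)
    # Keep whichever neighbor of the relaxed optimum gives the larger rate.
    if rate_after_ep(v_floor, **link) >= rate_after_ep(v_ceil, **link):
        v_star = v_floor
    else:
        v_star = v_ceil
    # Give up the channel (M* = 0) when transmitting is worth less than re-contending.
    reward = (rate_after_ep(v_star, **link) - lam) * link["block"]
    m_star = 0 if reward < s_star else v_star
    return v_star, m_star


link = dict(h_abs2=1.0, b0=0.0, eh_rate=1.0, block=10.0, slot=1.0, sigma2=1.0, b_cap=100.0)
rho_star = 1.0 - math.exp(-1.0)  # relaxed optimum for this link (C = 0, D = 1)
decision_low = ep_decision(rho_star, lam=0.2, s_star=0.0, **link)
decision_high = ep_decision(rho_star, lam=0.5, s_star=0.0, **link)
```

In this example the relaxed optimum $\rho^* L/l \approx 6.32$ is rounded to $V^* = 6$; the transmitter harvests for six slots when the threshold is low and releases the channel when it is high.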


Next, the following proposition formally quantifies the
optimal stopping rule $N^*$ and the equation to compute the
optimal throughput $\lambda^*$.

Proposition 3.5: The optimal stopping rule to solve problem
(9) is given by

$$N^* = \min\{ n \ge 1 : R_n(V^*) \ge \lambda^* \}, \tag{17}$$

with $V^*$ given in Proposition 3.4. Moreover, $\lambda^*$ satisfies the
following equation

$$\sum_{i=1}^{I} Q_i\, E\big[ (R^i(V^*) - \lambda^*)^+ \big] = \frac{\lambda^* l}{L}, \tag{18}$$

where the function $(x)^+$ means $\max\{x, 0\}$ for some real
number $x$, and $Q_i$ is the probability of a successful channel
contention at transmitter $i$, defined in Section II. The index
$n$ for $R^i(V^*)$ in (18) is removed since $\{R^i_n(V^*)\}_{n \ge 1}$ are
ergodic for $1 \le i \le I$.

Proof: Following (16) in Proposition 3.4, the stopping
rule $N^*$ has the form

$$N^* = \min\{ n \ge 1 : (R_n(V^*) - \lambda^*) L \ge S^*(\lambda^*) \}. \tag{19}$$

Thus, we can obtain $N^*$ by plugging $S^*(\lambda^*) = 0$ into (19),
which results in (17). Finally, equation (18) can be obtained
by plugging $S^*(\lambda^*) = 0$ into (10) and taking the expectation
on both sides. ■
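
A fixed-point equation of the form $\sum_i Q_i E[(R^i - \lambda)^+] = \lambda l / L$ can be solved by bisection, since its left-hand side is non-increasing in $\lambda$ while its right-hand side is increasing. The sketch below replaces each expectation with an empirical average over rate samples; the sampling approach and the names `solve_throughput`, `rate_samples`, and `q_success` are assumptions made for illustration and not the procedure of the paper.

```python
def solve_throughput(rate_samples, q_success, slot, block, tol=1e-9):
    """Solve sum_i Q_i * mean((R_i - lam)^+) = lam * slot / block for lam.

    rate_samples -- one list of sampled rates R^i(V*) per transmitter
    q_success    -- per-transmitter success probabilities Q_i
    """

    def gap(lam):
        lhs = sum(
            q * sum(max(r - lam, 0.0) for r in rates) / len(rates)
            for q, rates in zip(q_success, rate_samples)
        )
        return lhs - lam * slot / block

    # gap(0) >= 0 and gap(largest rate) <= 0, so the root lies in between.
    lo, hi = 0.0, max(max(rates) for rates in rate_samples)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# One link with a deterministic rate of 1 and Q = 0.5 (illustrative values).
lam_star = solve_throughput([[1.0, 1.0]], [0.5], slot=1.0, block=10.0)
```

In this toy case the transmitter always transmits, each renewal lasts $l/Q + L = 12$ time units and carries $L \cdot 1 = 10$ units of data, so the root $\lambda^* = 5/6$ agrees with the renewal-reward ratio.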

Remark 3.4: Note that the stopping rule (17) implies that
each transmitter has the same threshold that is globally determined
even when all transmitters have different statistics of the
CSI and ESI. The intuition is similar to that in ED: In order 
to guarantee the overall system performance, the transmitter 
with a bad channel condition and a low energy level should 
“sacrifice” its own reward, while the one with good conditions 
should transmit more data. 

Directly following Propositions 3.4 and 3.5, the next proposition
gives the DOS under the constant EH model.

Proposition 3.6: After the $n$-th round of CP, it is optimal
for the successful transmitter to take one of the following two
options:

1) release the channel immediately if $R_n(V^*) < \lambda^*$ (which
is equivalent to $M^* = 0$), and let all transmitters perform
the next round of CP;

2) otherwise, transmit after $V^*$ slots for EH, where $V^*$ is
given by Proposition 3.4.


C. Optimal Stopping Rule for i.i.d. EH Model 

As in the previous subsection, we first consider
problem (8) to find the optimal stopping rule $M^*$, then the
optimal stopping rule $N^*$ afterwards.

Under the i.i.d. EH model, $U_0(\mathcal{F}_0)$ has the form in (8). As
we mentioned in Section III-A, it is a finite-horizon stopping
problem [14], [22], and the solution of problem (8) could be
directly generalized in the next proposition.

Proposition 3.7: For $0 \le k \le L/l$ and some $\lambda > 0$, the
optimality equation for problem (8) is given by

$$U_k(\mathcal{F}_k) = \max\{ (R(k) - \lambda) L,\ -\lambda k l + S^*(\lambda),\ E[U_{k+1}(\mathcal{F}_{k+1}) \mid \mathcal{F}_k] \}, \tag{20}$$

and the optimal stopping rule has the following form:

$$M^* = \min\{ 0 \le k \le L/l : U_k(\mathcal{F}_k) = \max\{ (R(k) - \lambda) L,\ -\lambda k l + S^*(\lambda) \} \}. \tag{21}$$

The stopping rule $M^*$ given in (21) suggests that the EP
would stop at $M^*$ by either transmitting or giving up the
channel, which also indicates the final decision for the current
round of CP. Thus, the optimal stopping rule $N^*$ could be
obtained by reorganizing (21).
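
A finite-horizon optimality equation of the form $U_k = \max\{(R(k) - \lambda)L,\ -\lambda k l + S^*(\lambda),\ E[U_{k+1}]\}$ can be evaluated by backward induction over the EP slots. The sketch below is a minimal illustration under several assumptions: the channel gain is fixed, the battery level is tracked in integer units of $\delta$, the list `pmf[e]` gives the probability of harvesting $e$ units in one slot (with the slot length absorbed), the logarithm is natural, and $\lambda$ and $S^*(\lambda)$ are given. The function name `ep_values` and all parameter names are hypothetical.

```python
import math


def ep_values(h_abs2, pmf, b_max, delta, block, slot, sigma2, lam, s_star):
    """Backward induction for the EP stopping problem.

    Returns (U, stop), where U[k][b] is the optimal value at EP slot k with
    b stored energy units and stop[k][b] tells whether stopping is optimal.
    """
    n = round(block / slot)

    def stop_value(k, b):
        # Better of the two stopping actions: transmit now or give up the channel.
        if k == n:
            rate = 0.0
        else:
            snr = h_abs2 * b * delta / (sigma2 * (block - k * slot))
            rate = (1.0 - k / n) * math.log(1.0 + snr)
        return max((rate - lam) * block, -lam * k * slot + s_star)

    U = [[0.0] * (b_max + 1) for _ in range(n + 1)]
    stop = [[True] * (b_max + 1) for _ in range(n + 1)]
    for b in range(b_max + 1):
        U[n][b] = stop_value(n, b)
    for k in range(n - 1, -1, -1):
        for b in range(b_max + 1):
            # Expected value of probing one more slot, with the battery clipped at b_max.
            cont = sum(p * U[k + 1][min(b + e, b_max)] for e, p in enumerate(pmf))
            now = stop_value(k, b)
            U[k][b] = max(now, cont)
            stop[k][b] = now >= cont
    return U, stop


# Degenerate PMF (exactly one unit per slot), which mimics a constant EH rate.
U, stop = ep_values(1.0, [0.0, 1.0], b_max=100, delta=1.0, block=10.0,
                    slot=1.0, sigma2=1.0, lam=0.2, s_star=0.0)
```

With the degenerate PMF the harvesting path is deterministic, so the value at the start of EP coincides with the best reward over all EP lengths, which is attained after six slots in this example; a non-degenerate `pmf` yields a state-dependent stopping region in `stop`.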

Proposition 3.8: The optimal stopping rule of CP under the
i.i.d. EH model has the form

$$N^* = \min\left\{ n \ge 1 : U_{M^*}(\mathcal{F}_{n,M^*}) = (R_n(M^*) - \lambda^*)L \right\}, \qquad (22)$$

where $M^*$ is the optimal stopping rule of EP given in
Proposition 3.7. The optimal throughput $\lambda^*$ satisfies the following
equation


£<3.e [e [max{i?*(M*) — A* 


A *1 
~L' 


-\*M*l/L}\E a } + 


(23) 


The proof is analogous to that of the constant EH rate case and is
omitted here.

The next proposition, which directly follows from Propositions
3.7 and 3.8, summarizes the overall DOS under the i.i.d. EH model.


Proposition 3.9: After the $n$-th round of CP, it is optimal
for the successful transmitter to take one of the following two
options:

1) if $\max\left\{ (R_n(0) - \lambda^*)L,\; \mathbb{E}\left[U_1(\mathcal{F}_{n,1}) \mid \mathcal{F}_{n,0}\right] \right\} < 0$,
release the channel immediately and let all transmitters start
the next round of CP;

2) otherwise, start EP following the optimal stopping rule
$M^*$ given in Proposition 3.7.

Remark 3.5: Propositions 3.6 and 3.9 summarize the DOS
under the constant and i.i.d. EH models, respectively. We
observe that under the constant EH model, the EP can be
“forecasted” by finding the optimal $V^*$, so the transmission
decision is made before EP starts. In contrast, when the EH
rates are i.i.d., this decision can only be made step by step
during the EP.


IV. Battery Dynamics 

In this section, we validate the assumption made in Section
III that the energy level at each transmitter is stationary with
some distribution. First, we show that under the constant
EH model, the energy level stored at each transmitter forms a
Markov chain over time, while the state transition probabilities
of different transmitters are coupled. We then
propose an iterative algorithm to compute the corresponding
steady-state distribution, which is shown to converge to a
unique limit. Finally, we extend our analysis to the
i.i.d. EH rate model.


A. Battery with Constant EH Model 

Note that after CP, if the successful transmitter releases
the channel immediately, then the next round of CP starts,
and the battery continues to be charged. If the transmitter
starts the transmission, its energy level becomes zero at
the end of the transmission block according to Section II.
During this period, all other transmitters keep harvesting
energy. Thus, the energy level transition
over the transmission block can be determined. To simplify our
analysis, the transmission block is treated as one time slot with
length $L$ for the purpose of counting battery state transitions.
In addition, we assume that the battery works in half-duplex
mode, i.e., it cannot be charged while the transmitter transmits
data.

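For transmitter $i$ with EH rate $E^i$, $1 \le i \le I$,
the set of its energy states is given by
$B_t^i \in \mathcal{A}_i = \left\{0, E^i l, 2E^i l, \cdots, \left\lfloor \frac{B_{\max}\delta}{E^i l} \right\rfloor E^i l, B_{\max}\delta \right\}$, where $t \ge 1$ is the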



Fig. 2. The state transition of the energy level at transmitter i under the 
constant EH rate model. 


slot index. The state transition is depicted in Fig. 2. In addition,
we denote the distribution of the energy level for transmitter
$i$ at time $t$ as $\Pi_t^i = [\pi_{t,0}^i \cdots \pi_{t,B_{\max}}^i]$.

Next, we consider the state transition probability. Suppose
that transmitter $i$ is at energy level $u_i \in \mathcal{A}_i$. Three
events may happen at time slot $t$:

(i) It occupies the channel and transmits. According to
Section II, transmitter $i$ consumes all the energy for the
transmission and moves to energy level 0 after the
transmission. Thus, the transition probability is given by

$$p^i_{u_i,0} = Q_i\, p^i_{tr}(u_i), \qquad (24)$$

where $Q_i$ is the probability that the $i$-th transmitter occupies
the channel, and $p^i_{tr}(u_i)$ is the probability that it
transmits with energy level $u_i$. Furthermore,
$p^i_{tr}(u_i)$ can be computed as


pi(u i ) = P{R i en>A*} 


=P < log 


l + |/if 


Ui + V*lE l \ 
(L/l — V*)laf ) 



(25) 


where $V^*$ is given in Proposition 3.4. Note that in
(25), $|h^i|^2$ is the only random variable and its distribution is
known.


(ii) Other transmitters occupy the channel and transmit.
If any of the other $I - 1$ transmitters sends data,
transmitter $i$ harvests $E^i L$ units of energy during this
period and then attains level $v_i = \min\{u_i + E^i L, B_{\max}\delta\}$.
Suppose that the $j$-th transmitter transmits. Similar to the
first case, the probability of a transmission by the
$j$-th transmitter is given by $Q_j \sum_{b=0}^{B_{\max}} \pi^j_{t,b}\, p^j_{tr}(bE^j l)$, where
$bE^j l \in \mathcal{A}_j$ and thus $b \in \left\{0, 1, 2, \cdots, \left\lfloor \frac{B_{\max}\delta}{E^j l} \right\rfloor, B_{\max}\right\}$.

Since there are in total $I - 1$ other transmitters, the transition
probability for transmitter $i$ from level $u_i$ to $v_i$ is given by

$$p^i_{u_i,v_i} = \sum_{j \ne i} Q_j \sum_{b=0}^{B_{\max}} \pi^j_{t,b}\, p^j_{tr}(bE^j l). \qquad (26)$$


(iii) No transmission happens. In this case, transmitter $i$
just harvests $E^i l$ units of energy and goes into state
$w_i = \min\{u_i + E^i l, B_{\max}\delta\}$. The probability of this
event follows directly as

$$p^i_{u_i,w_i} = 1 - p^i_{u_i,0} - p^i_{u_i,v_i}. \qquad (27)$$

Note that when $u_i = v_i = w_i$, the transition probability is
given by

$$p^i_{u_i,u_i} = p^i_{u_i,v_i} + p^i_{u_i,w_i} = p^i_{u_i,v_i} + 1 - p^i_{u_i,0} - p^i_{u_i,v_i} = 1 - p^i_{u_i,0}. \qquad (28)$$













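In this way, we can compute all $\{p^i_{u_i,\tilde{u}_i}\}$ for $1 \le i \le I$,
where $u_i \in \mathcal{A}_i$ and $\tilde{u}_i \in \{0, v_i, w_i, B_{\max}\delta\}$. The transition
probability matrix is $P_t^i = \{p^i_{u_i,\tilde{u}_i}\}$ with
dimension $\left(\left\lceil \frac{B_{\max}\delta}{E^i l} \right\rceil + 1\right) \times \left(\left\lceil \frac{B_{\max}\delta}{E^i l} \right\rceil + 1\right)$. Obviously, $P_t^i$ is
a stochastic matrix, i.e., a square matrix in which all elements
are nonnegative and each row sums to 1. However, $P_t^i$ depends
on $t$ since $p^i_{u_i,v_i}$ depends on the state distribution $\Pi_t^j$ for all
$j \ne i$. Therefore, $\{B_t^i\}_{t \ge 0}$ is a non-homogeneous Markov
chain, whose state distribution evolves as

$$\Pi_{t+1}^i = \Pi_t^i P_t^i, \quad t \ge 0. \qquad (29)$$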

We propose Algorithm 1, which is summarized in Table I, to
compute the steady-state distribution for all transmitters. Here,
the infinity norm is applied, which is defined as $\|a\|_\infty =
\max_{1 \le j \le n} |a_j|$ for $a = [a_1 \cdots a_n]$.

TABLE I

Algorithm 1: Compute the steady-state distribution for all
transmitters.

• Initialize $\Pi_0^i$ for $1 \le i \le I$ and $\epsilon$, and compute $p^i_{u_i,0}$ by (24)
for all $u_i \in \mathcal{A}_i$ and $1 \le i \le I$;

• Set $t = 0$, compute $P_0^i$ by (26)-(28) for all $1 \le i \le I$,
and compute $\Pi_1^i$ by (29) for all $1 \le i \le I$. Then:

- While $\max_{1 \le i \le I} \|\Pi_{t+1}^i - \Pi_t^i\|_\infty > \epsilon$, repeat:

1) $t = t + 1$;

2) Update $P_t^i$ by (26)-(28) for all $1 \le i \le I$;

3) Compute $\Pi_{t+1}^i$ by (29) for all $1 \le i \le I$;

- end.

• Algorithm ends.


Proposition 4.1: For any given initial state distribution $\Pi_0^i$,
$\Pi_t^i = [\pi_{t,0}^i \cdots \pi_{t,B_{\max}}^i]$ generated by Algorithm 1
converges to a unique steady-state distribution $\Pi^i$ for all
$1 \le i \le I$.

The proof is given in Appendix D. 
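The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation. It assumes that $B_{\max}\delta$ is a multiple of $E^i l$, so that state `b` simply counts harvested units and the top state absorbs any overflow, and that the transmit probabilities `p_tr[i][b]` have already been computed (the names `p_tr`, `q`, and `jump`, the number of units gained over a block of length $L$, are hypothetical).

```python
import numpy as np

def transition_matrix(i, dists, p_tr, q, jump):
    """Build P_t^i from (24)-(28) for transmitter i on a uniform energy grid."""
    n_states = len(dists[i])
    top = n_states - 1
    # (26): probability that some other transmitter j transmits.
    p_other = sum(q[j] * float(dists[j] @ p_tr[j])
                  for j in range(len(dists)) if j != i)
    P = np.zeros((n_states, n_states))
    for b in range(n_states):
        p_self = q[i] * p_tr[i][b]              # (24): transmit, drop to level 0
        P[b, 0] += p_self
        P[b, min(b + jump, top)] += p_other     # harvest during another's block
        P[b, min(b + 1, top)] += 1.0 - p_self - p_other  # (27): harvest one slot
    return P  # coinciding targets are merged by +=, which covers (28)

def steady_state(p_tr, q, jump, eps=1e-10, max_iter=10_000):
    """Iterate (29) for all transmitters until the infinity-norm change is at most eps."""
    dists = [np.full(len(p), 1.0 / len(p)) for p in p_tr]
    for _ in range(max_iter):
        mats = [transition_matrix(i, dists, p_tr, q, jump) for i in range(len(p_tr))]
        new = [d @ P for d, P in zip(dists, mats)]
        gap = max(np.max(np.abs(n - d)) for n, d in zip(new, dists))
        dists = new
        if gap <= eps:
            break
    return dists

# Toy input: two transmitters, six energy levels each.
p_tr = [np.linspace(0.1, 0.9, 6), np.linspace(0.2, 0.8, 6)]
dists = steady_state(p_tr, q=[0.5, 0.5], jump=3)
print(dists[0], dists[1])
```

Each transmitter only needs the scalar summary `p_other` of the other transmitters' distributions, which is what keeps the per-iteration cost low compared with iterating on the joint chain.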

Remark 4.1: The steady-state distribution for all transmitters
can also be obtained by the iterative computation $\Pi_{t+1} =
\Pi_t P$ over the “super” Markov system, which is
constructed in Appendix D. However, this is not as efficient
as Algorithm 1. From the computational complexity point
of view, suppose that each transmitter has $m$ energy levels
and there are $n$ transmitters in total. The number of
states in the “super” Markov chain is $m^n$. With only
one processor, the floating-point cost of one iteration
of the state distribution for the “super” Markov chain is
approximately on the order of $O(2m^{2n})$. In contrast,
with Algorithm 1, (26) requires $n^2 m^2$ calculations,
updating $\{P_t^i\}$ requires about $nm$ calculations according to
(27), and computing $\{\Pi_t^i P_t^i\}$ requires $2nm^2$ calculations. Overall,
one iteration for all transmitters is approximately on the order
of $O(n^2 m^2)$, which is more efficient than the
“super” Markov chain, especially when $m$ and $n$ are large.
Moreover, our algorithm can also be run in parallel,
i.e., computing $\Pi_{t+1}^i = \Pi_t^i P_t^i$ for $1 \le i \le n$ at the same
time over different cores.
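To give a feel for the gap between the two operation counts in Remark 4.1, the short sketch below evaluates them for a couple of sizes. The function names are illustrative, and the formulas are simply the approximate counts quoted in the remark.

```python
def super_chain_flops(m: int, n: int) -> int:
    """Approximate cost of one update of the joint chain: 2 * m^(2n)."""
    return 2 * m ** (2 * n)

def algorithm1_flops(m: int, n: int) -> int:
    """Approximate per-iteration cost of Algorithm 1: n^2 m^2 + n m + 2 n m^2."""
    return n**2 * m**2 + n * m + 2 * n * m**2

for m, n in [(10, 3), (20, 5)]:
    print(m, n, super_chain_flops(m, n), algorithm1_flops(m, n))
```

Already for $m = 10$ and $n = 3$ the joint chain needs about two million operations per iteration, against roughly 1.5 thousand for the per-transmitter updates.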


B. Battery with i.i.d. EH Model 

The argument that the battery state evolves as a Markov
process in the random case is analogous to that of the constant
case in the previous subsection. The main difference is that the
probability $p^i_{tr}(u_i)$ defined by (25) changes and needs
to be rederived under the i.i.d. EH rate model.

We now consider the calculation of $p^i_{tr}(u_i)$. When transmitter
$i$ grabs the channel with energy level $u_i$, according to the
stopping rules $M^*$ in (21) and $N^*$ in (22), the transmitter checks the
condition $\max\left\{ (R(0) - \lambda^*)L,\; -\lambda^* l + \mathbb{E}\left[U_1(\mathcal{F}_1) \mid \mathcal{F}_0\right] \right\} > 0$. If
it is true, the transmitter starts EP until the $M^*$-th slot and
transmits when $(R(M^*) - \lambda^*)L > -\lambda^* M^* l$ according to (22).
Specifically, given $U_0(u_i, |h^i|^2) > 0$, the transmitter continues
EP at slot $k$ for $0 \le k \le M^* - 1$, which is equivalent to
$\max\left\{ (R(k) - \lambda^*)L,\; -\lambda^* k l \right\} < \mathbb{E}\left[U_{k+1}(\mathcal{F}_{k+1}) \mid \mathcal{F}_k\right]$, where
$\mathcal{F}_k = \left\{ u_i + \sum_{j=0}^{k} E_j^i l,\; |h^i|^2 \right\}$. Then, at slot $M^* = m \le L/l$,
the transmitter stops EP and transmits when $(R(m) - \lambda^*)L \ge
\max\left\{ -\lambda^* m l,\; \mathbb{E}\left[U_{m+1}(\mathcal{F}_{m+1}) \mid \mathcal{F}_m\right] \right\}$. Thus, we obtain
$$p^i_{tr}(u_i) = \int_0^\infty P\left\{\text{Transmits at } M^* \mid U_0(u_i, |h^i|^2) > 0\right\} \cdot P\left\{U_0(u_i, |h^i|^2) > 0\right\} f(|h^i|^2)\, d|h^i|^2, \qquad (30)$$

where $f(|h^i|^2)$ is the probability density function (PDF) of the
channel power gain. The probability $P\{U_0(u_i, |h^i|^2) > 0\}$
can be computed based on Proposition 3.7. For notational
simplicity, we omit the condition $U_0(u_i, |h^i|^2) > 0$, and the
first term in the integral of (30) can be expanded as

$$P\{\text{Transmits at } M^*\} = \sum_{m=0}^{L/l} \left( \prod_{k=0}^{m-1} P\{\alpha_k < 0\} \right) P\{\beta_m \le 0\}, \qquad (31)$$

where $\alpha_k = \max\left\{ (R(k) - \lambda^*)L,\; -\lambda^* k l \right\} - \mathbb{E}\left[U_{k+1}(\mathcal{F}_{k+1}) \mid \mathcal{F}_k\right]$
and $\beta_m = \max\left\{ -\lambda^* m l,\; \mathbb{E}\left[U_{m+1}(\mathcal{F}_{m+1}) \mid \mathcal{F}_m\right] \right\} -
(R(m) - \lambda^*)L$. Note that in $P\{\alpha_k < 0\}$, $R(k)$ and
$\mathbb{E}\left[U_{k+1}(\mathcal{F}_{k+1}) \mid \mathcal{F}_k\right]$ are random since they are functions
of $\sum_{j=0}^{k} E_j^i$, where $\{E_j^i\}_{1 \le j \le k}$ are i.i.d. with a known
distribution and $E_0^i = 0$. Thus, $P\{\alpha_k < 0\}$ can be computed.
By a similar argument, $P\{\beta_m \le 0\}$
can be computed as well. Therefore, the probability given in
(31) is computable. Overall, we obtain $p^i_{tr}(u_i)$ by
plugging (31) into (30).

After obtaining p l tr (ui), the transition probability {p l u ~ }, 
where m £ A, and Ui £ + 5, ■ ■ ■ ,B max 8}, can 

be calculated similarly to the constant EH rate case. In
addition, Algorithm 1 and Proposition 4.1 can be modified
to suit the i.i.d. EH model; the details are omitted
in this paper.

V. Computation of the Optimal Throughput 

The optimal throughput $\lambda^*$ hinges upon the optimal stopping
rules in (17) and (22). Thus, to fully obtain the optimal
scheduling policy of the proposed DOS, we next turn our
attention to computing the value of $\lambda^*$.

By Propositions 3.5 and 3.8, $\lambda^*$ can be obtained by solving
(18) or (23) under the constant or i.i.d. EH model, respectively.
Next, we briefly explain why there exists a $\lambda^*$ such
that equation (18) or (23) holds, and how to search for $\lambda^*$.
For brevity, we focus on the constant EH rate case.

Note that $R(V^*)$ is a function of the random variables $h^i$ and
$B_0^i$; we can calculate the expectation on the left-hand side
of (18) for each given $\lambda > 0$. This expectation requires the
distribution of $B_0^i$, i.e., the steady-state distribution $\Pi^i$, which
can be approximately computed as shown in Section IV. In
addition, for a given $\lambda$, an upper bound of this expectation can
be obtained by fixing $\Pi^i = [0, \cdots, 0, 1]$. As $\lambda$ increases from
zero to infinity, this upper bound decreases to zero at some
$\bar{\lambda} < \infty$. Since the right-hand side of (18) is strictly increasing
over $\lambda$ within the range $[0, +\infty)$, there exists at least one $\lambda^*$
satisfying (18). Therefore, an exhaustive one-dimension search
can be applied to obtain the optimal throughput over the range
$[0, \bar{\lambda}]$. Note that during each iteration of the exhaustive search,
Algorithm 1 (given in Section IV) is used to obtain the steady-state
distribution for a given $\lambda \in [0, \bar{\lambda}]$, and then we check
whether equation (18) or (23) holds. Finally, $\lambda^*$ should be the
largest value in $[0, \bar{\lambda}]$ that makes equation (18) or (23) hold.
In summary, the above search characterizes the optimal
stopping rules given in Propositions 3.5 and 3.8, which completes
the proposed DOS framework.
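The one-dimension search described above can be sketched as a grid scan that keeps the largest candidate at which the left-hand side has not yet dropped below the right-hand side. This is a hedged illustration: `lhs` and `rhs` are hypothetical callables standing in for the two sides of the fixed-point equation (in the paper, evaluating the left-hand side would involve running Algorithm 1 for each candidate), and the toy functions at the bottom are made up purely to exercise the routine.

```python
import numpy as np

def search_lambda(lhs, rhs, lam_max, num=10_001):
    """Return the largest grid point in [0, lam_max] with lhs(lam) >= rhs(lam)."""
    grid = np.linspace(0.0, lam_max, num)
    gap = np.array([lhs(lam) - rhs(lam) for lam in grid])
    feasible = np.nonzero(gap >= 0.0)[0]
    if feasible.size == 0:
        raise ValueError("no crossing found on [0, lam_max]")
    return float(grid[feasible[-1]])

# Toy stand-in: lhs decreases to zero, rhs increases strictly; they cross at lam = 1.
lam_star = search_lambda(lambda lam: max(2.0 - lam, 0.0),
                         lambda lam: lam,
                         lam_max=4.0)
print(lam_star)
```

A grid scan is used instead of bisection because the text only guarantees that a solution exists and asks for the largest one; the resolution of the answer is limited by the grid spacing.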


VI. Numerical results 


In this section, we first validate Propositions 3.5 and 3.8 to
show that the optimal throughput $\lambda^*$ exists and can be found
via one-dimension search. Second, we investigate the throughput
gain of our proposed DOS with two-level probing over
the best-effort delivery method, where the data is transmitted
whenever the channel contention is successful. Note that such
a method can be realized in the proposed DOS framework
by fixing $M = 0$ and setting $N = 1$ in (17) and (22). Let
$\lambda_0$ denote the throughput obtained by the best-effort scheme,
which can be calculated as


Ao = 


zL [ Ll °s (1 + 1 / 4 , 


12 K,o 

I La 2 


h + L 


(32) 


In general, a typical button cell battery has a capacity
of 150 mAh with an end-point voltage of 0.9 V, which is
equal to 150 mAh × 3600 s/h × 0.9 V = 486 J. A thin-film
rechargeable battery can offer 50 μAh at 3.3 V, which is
equal to 0.594 J. Since a typical transmission time interval is
on the time scale of milliseconds, we let the energy unit be
$\delta = 10^{-3}$ J in the simulation. Accordingly, we set the capacity
of the battery to $B_{\max}\delta = 10^5\delta$, which falls between the capacity
of a thin-film battery and that of a button cell battery.
Also, current commercial solar panels can provide power
from 1 W to about 400 W, which is equivalent to $1\delta\cdot\text{ms}^{-1}$
to $400\delta\cdot\text{ms}^{-1}$. Accordingly, in our simulation, we
let the EH rate vary within the range $[0, 40\delta]$. In addition,
the channel gains are i.i.d. for different links and the channel
power gains follow an exponential distribution with mean 5.
The variance of the noise is set to be 10 mW. The length
of one time slot is $l = 1$ ms and the length of a
transmission block is $L = 100l$.
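The unit conversions behind these simulation settings are easy to double-check. The helper below (an illustrative name, not from the paper) converts a capacity in mAh at a given voltage into joules and compares the chosen battery capacity of $10^5\delta$ against the two reference batteries.

```python
def mah_to_joules(capacity_mah: float, voltage: float) -> float:
    """Energy in joules for a battery rated capacity_mah (mAh) at voltage (V)."""
    return capacity_mah * 1e-3 * 3600.0 * voltage

button_cell = mah_to_joules(150.0, 0.9)   # 150 mAh at 0.9 V
thin_film = mah_to_joules(0.050, 3.3)     # 50 uAh = 0.050 mAh at 3.3 V
delta = 1e-3                              # energy unit in joules
capacity = 1e5 * delta                    # B_max * delta, in joules

print(button_cell, thin_film, capacity)
print(thin_film < capacity < button_cell)
```

Since 1 W equals $10^{-3}$ J per millisecond, a harvesting power of 1 W corresponds to exactly one energy unit $\delta$ per 1 ms slot.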



Fig. 3. $\lambda$ vs. the average throughput.


1) Validation of Propositions 3.5 and 3.8: In Fig. 3, we
illustrate the variation of the average throughput as the “threshold”
$\lambda$ changes. Without loss of generality, we first consider
a homogeneous network with 10 user pairs, i.e., all pairs are
identical. For the constant EH model, the EH rate is set to be
$E = 10\delta$ for all transmitters. For the i.i.d. EH case, we choose
the Bernoulli model [25], [26]: the EH rate is either zero or
a finite value, each with probability 0.5. In our simulation, we
consider three cases for the mean value in the i.i.d. EH model:
$7.5\delta$, $10\delta$, and $20\delta$.

First, we observe in Fig. 3 that as $\lambda$ increases from zero,
the average throughput first increases and then decreases. The
optimal point is achieved at $\lambda^*$, where the average throughput
reaches its peak, which is also approximately equal to
$\lambda^*$. Taking the i.i.d. EH model with mean $20\delta$ as
an example in Fig. 3, the value of the optimal throughput
$\lambda = \lambda^*$ is approximately 4.5, and the actual optimal average
throughput is about 4.5 as well. Therefore, this observation
validates Propositions 3.5 and 3.8 and the discussions in Section
III. Second, we observe that the average throughput is almost
the same when the mean of the EH rate in the i.i.d. EH model
is equal to the EH rate in the constant EH model. Thus, the
type of EH rate model does not directly determine the average
throughput performance.

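2) Throughput gain: We use $\lambda_{EP}$ to denote the throughput
where only EP is adopted, i.e., setting $N = 1$ and $M = M^*$,
and $\lambda_{CP}$ to denote the throughput where only CP is adopted,
i.e., setting $N = N^*$ and $M = 0$. Thus, the throughput gains
are defined as:

$$\begin{cases} G_{EP} = \dfrac{\lambda_{EP} - \lambda_0}{\lambda_0}, & \text{gain from EP;} \\[2mm] G_{CP} = \dfrac{\lambda_{CP} - \lambda_0}{\lambda_0}, & \text{gain from CP;} \\[2mm] G_{DOS} = \dfrac{\lambda^* - \lambda_0}{\lambda_0}, & \text{gain from CP + EP.} \end{cases} \qquad (33)$$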
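All three gains in (33) are the same relative-improvement formula applied to different throughputs. A small helper makes this explicit; the numbers in the usage line are invented for illustration and are not simulation results from the paper.

```python
def throughput_gain(lam: float, lam_0: float) -> float:
    """Relative gain (lam - lam_0) / lam_0 over the best-effort throughput lam_0."""
    if lam_0 <= 0:
        raise ValueError("baseline throughput must be positive")
    return (lam - lam_0) / lam_0

# Hypothetical values only: a baseline of 4.0 and an EP-only throughput of 4.76.
g_ep = throughput_gain(4.76, 4.0)
print(g_ep)
```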

In Fig. 4, we evaluate the above throughput gains for a
network with $I = 3$ user pairs. Recall from Section II that our
analysis is applicable for $I \ge 2$. Since the constant and i.i.d.
EH rate models attain the same throughput performance
over $\lambda$, we only consider the constant EH model in this case.
In particular, we study a heterogeneous case where the first two
transmitters have the same EH rate $2\delta$, while the EH rate of
the third transmitter varies from $2\delta$ to $100\delta$.


Fig. 4. The throughput gain v.s. EH rate of the third transmitter. 


Fig. 5. The throughput gain v.s. the size of the network. 


We observe in Fig. 4 that as the EH rate of the third
transmitter increases, $G_{EP}$ stays almost constant at
a gain of about 19%. This implies that after the channel
contention, the successful transmitter with any EH rate can
perform EP to enhance its average transmission rate over the
transmission block. Thus, the ESI of the successful transmitter
does not have an obvious impact on the throughput. However, we
notice that $G_{CP}$ achieves its maximum when all transmitters
are identical (with the same EH rate $2\delta$) and then decreases
slowly as the EH rate of the third transmitter increases.
The intuition is that when the difference among EH rates
becomes larger, the stopping rule of CP is more likely to let
the transmitter with a relatively low energy level give up the
channel, which results in a longer time spent on CP, so the
throughput gain is lower than in the case when all transmitters
are identical. Regarding $G_{DOS}$, our proposed DOS with two-stage
probing achieves the highest throughput gain among the
three schemes. It is worth noting that as the EH rate of
the third transmitter increases, the advantage of DOS over
the scheme with pure CP becomes more apparent, although slowly,
which implies that the second-stage probing brings more
benefits. Our intuition is that a larger difference among the EH
rates leads to a bigger difference in energy levels. Since EP
allows the successful transmitter with a relatively lower energy
level to harvest more energy after CP, EP plays
a more important role as the difference among the EH rates
increases.

In Fig. 5, we illustrate how the size of the network influences
the throughput gains. In this scenario, we start from a
three-pair network with EH rates $2\delta$, $2\delta$, and $80\delta$, respectively.
Then, we keep adding pairs with EH rate $2\delta$ at the transmitter
side. We observe that the throughput gain $G_{CP}$ increases
slightly as the size of the network increases. This is reasonable
since CP exploits the multi-user diversity of both channel
gains and energy levels. $G_{CP}$ increases slowly
because we only add a low-EH-rate transmitter each time.
We also observe that $G_{EP}$ decreases. The reason is that
the more transmitters there are in the network, the lower the
probability that each transmitter transmits, so more transmitters
maintain a high energy level. Thus, EP is rarely triggered
after a channel contention. For the same reason, $G_{DOS}$
approaches $G_{CP}$ as the size of the network increases.

VII. Conclusion 

In this paper, we proposed a DOS framework for a heterogeneous
single-hop ad hoc network, in which each transmitter
is powered by a renewable energy source and accesses the
channel randomly. Our DOS framework includes two successive
processes: all transmitters first probe the channel via
random access, and then the successful transmitter decides
whether to give up the channel or to optimally probe the energy
before data transmission. The optimal scheduling policy of
the DOS framework was obtained as follows. First, assuming
the battery state is stationary at each transmitter, the expected
throughput maximization problem was formulated as a rate-of-return
optimal stopping problem, which was solved for
both the constant and i.i.d. EH rate models. Second, for a fixed
stopping rule, the stored energy level at each transmitter
was shown to have a steady-state distribution as time goes to
infinity, and an efficient iterative algorithm was proposed
for its computation. Finally, the optimal throughput and the
scheduling policy were obtained via one-dimension search with
the above two steps (i.e., finding the form of the optimal
stopping rule and calculating the steady-state distribution)
repeated in each iteration. Numerical results were also provided
to validate our analysis; the proposed DOS with two-level
probing was shown to outperform the best-effort delivery
method.

Appendices 

A. Proof of Proposition 3.1

For the first part of Proposition 3.1, it follows from Theorem
1 in Chapter 3 of [14] that $N^*(\lambda)$ exists and $S^*(\lambda)$ is attained
by this $N^*(\lambda)$ if the following two conditions are satisfied:

(C1) $\limsup_{N \to \infty} r_N(\lambda) \le r_\infty(\lambda)$ a.s.;

(C2) $\mathbb{E}\left[\sup_{N \ge 1} r_N(\lambda)\right] < \infty$,

where $r_N(\lambda)$ is the net reward defined earlier. As we pointed out in Section
III, the energy level $B_{N,0}$ is stationary for $N \ge 1$. Although
$\{R_N(M_N^*)\}_{N \ge 1}$ are independent, they may not be identically
distributed with respect to $h_N$ and $B_{N,0}$. However, it is not
difficult to show that (C1) and (C2) hold. The idea is that we
first consider that every transmitter has the same statistics; then
we apply the channel contention probabilities as the summation
coefficients over all transmitters.

For (Cl), if we assume that all transmitters have 
the same statistics as transmitter i, then {R , n (M^ [ )}n> 1 
become i.i.d.. Since E [R l N (M* n )] < oo according 

to Section uni and the accumulated cost A7’y = 
XI (jKn + J2n=i(Kn + M*)^J —^ oo as N — y oo a.s., we 
obtain that P {limsup^^o,-, r l N { A) = — 00 } = 1. Recall from 
Section mi that the channel is occupied by transmitter i with 
probability Qi and Yli =1 = 1> we obtain that 

1 = ^-P {lim sup r l N ( A) = —00 

Q l N^oo 

= P < limsuprjv(A) = — 00 

N—foo 

which proves that (Cl) holds. 

For (C2), it can be shown that 


E 


sup r l N ( A) 

N>1 


= E 

< E 


sup ((R^(M* n ) - X) L - XT n ) 

N> 1 


sup (R i N (M* N ) - X(IN + L)) 

N> 1 


(34) 


due to the fact that K n > 1 and M* > 0 for 1 < n < N. Since 
E (R' n (M* n )Y < 00 , it follows that the right-hand side of 
(l34l > is finite by Theorem 1 in Chapter 4 of fl4l . Similar to 
the technique in the proof of (Cl), we have 


E 

sup r N (X) 


sup r l N ( A) 


N> 1 


N>1 


which shows that (C2) also holds. 

For the second part, we know that with the cost XIKn at 
the A ; -th CP for any N > 1, the successful transmitter could 
choose one of three actions: transmits immediately with reward 
(.R/v (0) — A )L; or gives up the channel immediately, and 
obtains the optimal expected net reward S'* (A) based on the 
property of time invariance described in Section lTH-Al or starts 
EP and obtains the expected net reward E [(7i(J r /v,i) | .F/v,o]- 
Thus, by the optimal stopping theory lfl4l . l2H . S*(A) satisfies 
the optimality equation under (C2) as 


S*(A) = - A IK n + 

max{S'* (A), (R N ( 0) - X)L,E[U 1 (T NA ) | ^, 0 ]} , 

which is equivalent to (ITot . 


B. Proof of Proposition 3.2

For 1), we show the concavity of the function $y(x)$ by checking
its second-order derivative over $[0,1)$, which is given by

$$y''(x) = -\frac{(a + b)^2}{(1 - x)\left[a + 1 + (b - 1)x\right]^2} < 0.$$

Therefore, $y(x)$ is concave over $[0,1)$ [23]. To prove the
second part of 1), we check the first-order derivative of $y(x)$,
which is given by

$$y'(x) = -\log\left(1 + \frac{a + bx}{1 - x}\right) + \frac{a + b}{1 - x + a + bx}. \qquad (35)$$

It is easy to see that as $x \to 1^-$, the first term on the right-hand
side of (35) goes to negative infinity, while the second
term is bounded. Hence, $y'(x)$ is strictly negative as $x \to 1^-$.
Therefore, part 1) is proved.
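The two derivative expressions in this proof can be sanity-checked numerically. The sketch below assumes that $y(x) = (1-x)\log\left(1 + \frac{a+bx}{1-x}\right)$, a form that reproduces (35) on differentiation but is not restated in this appendix, and compares central finite differences with the closed forms. The parameter values are arbitrary.

```python
import math

def y(x, a, b):
    # Assumed form of y(x); differentiating it gives (35).
    return (1 - x) * math.log(1 + (a + b * x) / (1 - x))

def y_prime(x, a, b):
    """Closed-form first derivative, as in (35)."""
    return -math.log(1 + (a + b * x) / (1 - x)) + (a + b) / (1 - x + a + b * x)

def y_second(x, a, b):
    """Closed-form second derivative used in the concavity argument."""
    return -(a + b) ** 2 / ((1 - x) * (a + 1 + (b - 1) * x) ** 2)

def central_diff(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)

a, b = 0.7, 2.0
for x in (0.1, 0.5, 0.9):
    d1 = central_diff(lambda t: y(t, a, b), x)
    d2 = central_diff(lambda t: y_prime(t, a, b), x)
    print(x, d1 - y_prime(x, a, b), d2 - y_second(x, a, b))
```

The printed differences should be close to zero, and `y_second` is negative on the whole interval, in line with the concavity claim.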

Next, we prove 2). By checking the second-order derivative
of $g(x)$, we obtain

$$g''(x) = -\frac{a^2}{(1 - x)(a + 1 - x)^2} < 0,$$

which implies that $g(x)$ is concave. For the second part of 2),
we consider the first-order derivative of $g(x)$, which is given
by

$$g'(x) = -\log\left(1 + \frac{a}{1 - x}\right) + \frac{a}{1 - x + a}. \qquad (36)$$

Since $g''(x) < 0$, it follows that

$$\max_{0 \le x < 1} g'(x) = g'(0) = -\log(1 + a) + \frac{a}{1 + a}.$$

Moreover, due to the fact that $\frac{d}{da}\left(-\log(1 + a) + \frac{a}{1 + a}\right) =
-\frac{a}{(1 + a)^2} < 0$ for arbitrary $a > 0$, we obtain

$$\max_{0 \le x < 1} g'(x) = g'(0) < \left.\left(-\log(1 + a) + \frac{a}{1 + a}\right)\right|_{a = 0} = 0,$$

which proves the second part of 2).


C. Proof of Proposition li.il 

According to Part 1) of Proposition 3.2, we obtain that $G(p)$
is concave over $p \in [0,1)$, which means that $G'(p) = \frac{dG(p)}{dp}$
is decreasing over $[0,1)$ and attains its maximum at $p = 0$.
Then, finding the maximum of $G(p)$ boils down to two cases:

1) $G'(p)|_{p=0} \le 0$: It follows that $G(p)$ is decreasing over
$[0,1)$, and $p^* = 0$ is the optimum.

2) $G'(p)|_{p=0} > 0$: The point $p_0$, satisfying $G'(p)|_{p=p_0} =
0$, lies on the right-hand side of $p = 0$. By Part 1) of
Proposition 3.2, $G'(p) < 0$ as $p \to 1^-$, which implies
that $p_0 \in [0,1)$. Since the optimal point $p^* \le \frac{B_{\max}\delta - B_0}{EL}$
due to (13), it follows that $p^* = \min\left\{p_0, \frac{B_{\max}\delta - B_0}{EL}\right\}$.

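Note that $G'(p)|_{p=0} > 0$ is equivalent to $\frac{C + D}{1 + C} > \log(1 + C)$,
where $C = \frac{|h|^2 B_0}{L\sigma^2} > 0$ and $D = \frac{|h|^2 E}{\sigma^2} > 0$, and $G'(p)|_{p=p_0} = 0$
is equivalent to

$$\log\left(1 + \frac{C + Dp_0}{1 - p_0}\right) = \frac{C + D}{1 - p_0 + C + Dp_0}. \qquad (37)$$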


Next, we show that when $\frac{C + D}{1 + C} > \log(1 + C)$, (37) has a
unique solution. For $p \in [0,1)$, the left-hand side of (37) is
increasing over $p$ from $\log(1 + C)$ to $+\infty$. For its right-hand
side, we have the following two cases:

1) $D \ge 1$: The right-hand side of (37) decreases from $\frac{C + D}{1 + C}$
to 1. Since $\frac{C + D}{1 + C} > \log(1 + C)$, there exists a unique
solution $p_0$ for (37);

2) $0 < D < 1$: The right-hand side of (37) increases from
$\frac{C + D}{1 + C}$ to 1. If the first-order derivative of the left-hand side
of (37) is always greater than that of the right-hand side,
there must be only one solution for (37) when $\frac{C + D}{1 + C} >
\log(1 + C)$. Thus, we check their first-order derivatives.
For the left-hand side of (37), we obtain

$$\frac{d}{dp} \log\left(1 + \frac{C + Dp}{1 - p}\right) = \frac{C + D}{(1 - p)\left(1 + C + (D - 1)p\right)}; \qquad (38)$$

for the right-hand side, we have

$$\frac{d}{dp} \left(\frac{C + D}{1 - p + C + Dp}\right) = \frac{(C + D)(1 - D)}{\left(1 + C + (D - 1)p\right)^2}. \qquad (39)$$

Thus, by calculating the difference between (38) and (39),
we arrive at

$$\frac{C + D}{(1 - p)\left(1 + C + (D - 1)p\right)} - \frac{(C + D)(1 - D)}{\left(1 + C + (D - 1)p\right)^2} = \frac{(C + D)^2}{(1 - p)\left(1 + C + (D - 1)p\right)^2} > 0. \qquad (40)$$

Therefore, there exists a unique solution $p_0$ satisfying
(37).

In conclusion, the proposition is proved.

Remark: Since $p_0$ is proved to be unique in (37), $p_0$ can
be found by a simple one-dimension search
method, e.g., bisection search.
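A bisection search for the root of (37) can be sketched as follows. The function names and the example values of $C$ and $D$ are illustrative; the routine relies only on the facts established above, namely that the difference between the two sides of (37) is strictly increasing in $p$ and tends to $+\infty$ as $p \to 1^-$.

```python
import math

def residual(p, C, D):
    """Left-hand side minus right-hand side of (37) at p."""
    lhs = math.log(1 + (C + D * p) / (1 - p))
    rhs = (C + D) / (1 - p + C + D * p)
    return lhs - rhs

def solve_p0(C, D, tol=1e-12):
    """Bisection for the unique root p0 of (37) on [0, 1)."""
    if residual(0.0, C, D) >= 0:
        return 0.0  # G'(0) <= 0: no interior stationary point, optimum at p = 0
    lo, hi = 0.0, 1.0 - 1e-15  # residual < 0 at lo and > 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid, C, D) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p0 = solve_p0(C=0.5, D=2.0)
print(p0, residual(p0, 0.5, 2.0))
```

The returned value still has to be capped by the battery-capacity bound on $p^*$ stated in the proof before it is used as the optimum.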


D. Proof of Proposition 4.1

To prove this proposition, we construct an auxiliary “super”
Markov chain in which each state is a “super” vector of
aggregated energy levels across the whole network, whose
transition probability matrix does not change over time $t$.
Afterwards, we prove that such a “super” Markov chain has
a unique steady-state distribution. Then, we show that for any
time $t$ in the original Markov chain, one iteration for updating
$\Pi_t^i$ for $1 \le i \le I$ in Algorithm 1 is equivalent to the evolution
of the state distribution in the “super” Markov chain, thereby
proving the convergence of Algorithm 1.

To construct such a “super” Markov chain, we need to
jointly consider the states of energy levels across all transmitters.
Let $\Xi$ denote the set of all possible battery states over
the whole system, i.e.,

$$\Xi = \left\{ \mathbf{u} = (u_1 \cdots u_I) : u_1 \in \mathcal{A}_1, \cdots, u_I \in \mathcal{A}_I \right\}. \qquad (41)$$

Furthermore, we use $\mathbf{B}_t$ to denote the battery state of the system
at time $t$, and thus we have $\mathbf{B}_t \in \Xi$. Note that the number
of elements in $\Xi$ is $\left(\left\lceil \frac{B_{\max}\delta}{E^1 l} \right\rceil + 1\right) \times \cdots \times \left(\left\lceil \frac{B_{\max}\delta}{E^I l} \right\rceil + 1\right)$.
Suppose that $\mathbf{B}_t = \mathbf{u}$. There are $I + 1$ possible events at
time $t$: a transmission is performed by transmitter $i$, where
$1 \le i \le I$, or no transmission happens.

If the $i$-th transmitter transmits, there is $\mathbf{B}_{t+1} = \mathbf{v}_i$, where
$\mathbf{v}_i \in \Xi$ and

$$\mathbf{v}_i = \left( \min\{u_1 + E^1 L, B_{\max}\delta\}, \cdots, 0, \cdots, \min\{u_I + E^I L, B_{\max}\delta\} \right)^T,$$

in which the $i$-th element is zero. According to (24), the
corresponding transition probability is given by

$$p_{\mathbf{u},\mathbf{v}_i} = Q_i\, p^i_{tr}(u_i), \quad 1 \le i \le I. \qquad (42)$$

If no transmission happens, all transmitters just harvest
energy for one time slot. Then, we obtain $\mathbf{B}_{t+1} = \mathbf{w}$, where
$\mathbf{w} \in \Xi$ and

$$\mathbf{w} = \left( \min\{u_1 + E^1 l, B_{\max}\delta\}, \cdots, \min\{u_i + E^i l, B_{\max}\delta\}, \cdots, \min\{u_I + E^I l, B_{\max}\delta\} \right)^T.$$

The corresponding transition probability is just the complement
of the transmission probability over all other possible $I$
cases, which is given by

$$p_{\mathbf{u},\mathbf{w}} = 1 - \sum_{i=1}^{I} Q_i\, p^i_{tr}(u_i). \qquad (43)$$

Therefore, $\{\mathbf{B}_t\}_{t \ge 0}$ is a unichain [24], i.e., a finite-state
Markov process that contains a single recurrent class. By calculating
the transition probability for each $\mathbf{u} \in \Xi$, we obtain
the transition probability matrix $\mathbf{P}$ for $\{\mathbf{B}_t\}_{t \ge 0}$. Clearly, $\mathbf{P}$ is
a stochastic matrix and is invariant over time. Therefore, there
exists a unique probability vector $\Pi$ such that $\Pi = \Pi\mathbf{P}$ holds
[24]. In fact, $\Pi$ is the steady-state distribution of $\{\mathbf{B}_t\}_{t \ge 0}$.

So far, we have constructed a “super” Markov chain
$\{\mathbf{B}_t\}_{t \ge 0}$ for the whole system, for which the steady-state
distribution exists and is unique. Therefore, by the iteration
$\Pi_{t+1} = \Pi_t \mathbf{P}$, we have $\lim_{t \to \infty} \Pi_t = \Pi$. Thus, it suffices to
show that

$$\Pi_{t+1} = \Pi_t \mathbf{P} \iff \begin{cases} \Pi^1_{t+1} = \Pi^1_t P^1_t, \\ \quad\vdots \\ \Pi^I_{t+1} = \Pi^I_t P^I_t, \end{cases} \quad t \ge 0. \qquad (44)$$

If (44) is true, the state distribution of each transmitter
converges to the unique steady-state distribution.

Next, we show that both directions “⇒”
and “⇐” of (44) hold. For notational simplicity, we omit
the time index $t$. In fact, the direction “⇐” is the same as
constructing the “super” Markov chain as discussed earlier.
If the system is at state $\mathbf{u} = (b_1 E^1 l \cdots b_I E^I l)$, where
$b_i \in \left\{0, 1, 2, \cdots, \left\lfloor \frac{B_{\max}\delta}{E^i l} \right\rfloor, B_{\max}\right\}$, $1 \le i \le I$, the probability
$\Pi(\mathbf{u})$ is the joint probability over all transmitters,
i.e., $\Pi(\mathbf{u}) = \prod_{i=1}^{I} \pi^i_{b_i}$. The way of constructing the transition
probability matrix $\mathbf{P}$ is given by (42) and (43), which can be
obtained directly from (24) for $\{P^i\}$. Thus, both $\Pi$ and $\mathbf{P}$
can be obtained from the right-hand side of (44).

For the direction “⇒” of (44), we need to show how
to obtain $\{\Pi^i\}$ and $\{P^i\}$ from the left-hand side of (44).
We consider $\{\Pi^i\}$ first. Given the state distribution $\Pi$ of
the system, there exists a one-to-one mapping from each
element of $\Xi$ to that of $\Pi$. Let $\Pi(\mathbf{u})$ denote the probability
of the system staying at state $\mathbf{u} \in \Xi$. Obviously, there is
$\sum_{\mathbf{u} \in \Xi} \Pi(\mathbf{u}) = 1$. Then, we consider the subset of $\Xi$ such
that transmitter $i$ stays at state $u \in \mathcal{A}_i$, i.e.,

$$\Xi_{u_i = u} = \left\{ \mathbf{u} = (u_1 \cdots u_i \cdots u_I) : u_1 \in \mathcal{A}_1, \cdots, u_i = u, \cdots, u_I \in \mathcal{A}_I \right\}. \qquad (45)$$

Clearly, (45) satisfies $\bigcup_{u \in \mathcal{A}_i} \Xi_{u_i = u} = \Xi$. Then, the probability
that transmitter $i$ stays at state $u = bE^i l$, where
$b \in \left\{0, 1, 2, \cdots, \left\lfloor \frac{B_{\max}\delta}{E^i l} \right\rfloor, B_{\max}\right\}$, is equal to the probability
that the system is staying in $\Xi_{u_i = u}$, i.e.,

$$\pi^i_b = P\left\{\Xi_{u_i = u}\right\} = \sum_{\mathbf{u} \in \Xi_{u_i = u}} \Pi(\mathbf{u}). \qquad (46)$$

In this way, we can obtain the state distribution $\Pi^i$ for
transmitter $i$ such that $\Pi^i = [\pi^i_0 \cdots \pi^i_b \cdots \pi^i_{B_{\max}}]$.

Next, we consider $\{P^i\}$. When transmitter $i$ stays at
the energy state $u \in \mathcal{A}_i$, it can transfer to state 0, $v_1$,
or $v_2$, where $v_1 = \min\{u + E^i L, B_{\max}\delta\}$ and $v_2 =
\min\{u + E^i l, B_{\max}\delta\}$. Accordingly, from $\Xi_{u_i = u}$, there are
three possible cases:

1) $\Xi_{u_i = u} \to \Xi_{u_i = 0}$: For each state $\mathbf{u} \in \Xi_{u_i = u}$, there
is only one possible route to $\Xi_{u_i = 0}$, with probability
$Q_i\, p^i_{tr}(u)$, such that transmitter $i$ transmits and goes into
state 0. In fact, this transition probability does not change
for any $\mathbf{u} \in \Xi_{u_i = u}$. Thus, by taking all possible states
into account, the transition probability can be computed
by

$$p^i_{u,0} = P\left\{\Xi_{u_i = u} \to \Xi_{u_i = 0} \mid \Xi_{u_i = u}\right\} = \frac{Q_i\, p^i_{tr}(u)\, P\{\Xi_{u_i = u}\}}{P\{\Xi_{u_i = u}\}} = Q_i\, p^i_{tr}(u), \qquad (47)$$

which is equal to (24).

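2) $\Xi_{u_i=u} \to \Xi_{u_i=v_1}$: For each state $\mathbf{u} \in \Xi_{u_i=u}$,
there are $I - 1$ possible routes to $\Xi_{u_i=v_1}$. We pick the route
caused by transmitter $j \neq i$, i.e., the $j$-th transmitter
transmits. Suppose that at state $\mathbf{u}$, transmitter $j$ is in
the energy state $bE_H^j \in A_j$. The probability of staying at
$\Xi_{u_i=u,u_j=bE_H^j}$ is given as $\pi^j_{bE_H^j} \mathbb{P}\{\Xi_{u_i=u}\}$
by (46). Thus, the transition $\Xi_{u_i=u,u_j=bE_H^j} \to \Xi_{u_i=v_1,u_j=0}$
describes the transition of transmitter $i$ from state $u$ to state $v_1$
caused by transmitter $j$ with energy level $u_j = bE_H^j$.
Similarly to (47), the transition probability for this
case is given by

$$\mathbb{P}\{\Xi_{u_i=u,u_j=bE_H^j} \to \Xi_{u_i=v_1,u_j=0} \mid \Xi_{u_i=u,u_j=bE_H^j}\} = \frac{Q_j p^j_{\mathrm{tr}}(bE_H^j)\, \mathbb{P}\{\Xi_{u_i=u,u_j=bE_H^j}\}}{\mathbb{P}\{\Xi_{u_i=u,u_j=bE_H^j}\}} = Q_j p^j_{\mathrm{tr}}(bE_H^j).$$

When we extend this to all transmitters other than $i$ and
consider all possible states for each transmitter, we obtain the
probability of the one-step transition $\Xi_{u_i=u} \to \Xi_{u_i=v_1}$
as

$$\begin{aligned}
\mathbb{P}\{\Xi_{u_i=u} \to \Xi_{u_i=v_1} \mid \Xi_{u_i=u}\}
&= \frac{\mathbb{P}\{\Xi_{u_i=u} \to \Xi_{u_i=v_1},\, \Xi_{u_i=u}\}}{\mathbb{P}\{\Xi_{u_i=u}\}} \\
&= \frac{1}{\mathbb{P}\{\Xi_{u_i=u}\}} \sum_{j \neq i} \sum_{b=0}^{B_{\max}} \Big( \mathbb{P}\{\Xi_{u_i=u,u_j=bE_H^j}\}\, \mathbb{P}\{\Xi_{u_i=u,u_j=bE_H^j} \to \Xi_{u_i=v_1,u_j=0} \mid \Xi_{u_i=u,u_j=bE_H^j}\} \Big) \\
&= \frac{1}{\mathbb{P}\{\Xi_{u_i=u}\}} \sum_{j \neq i} \sum_{b=0}^{B_{\max}} \pi^j_{bE_H^j}\, \mathbb{P}\{\Xi_{u_i=u}\}\, Q_j p^j_{\mathrm{tr}}(bE_H^j) \\
&= \sum_{j \neq i} \sum_{b=0}^{B_{\max}} \pi^j_{bE_H^j}\, Q_j p^j_{\mathrm{tr}}(bE_H^j).
\end{aligned} \tag{48}$$

Thus, (48) is equivalent to (26).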
3) $\Xi_{u_i=u} \to \Xi_{u_i=v_2}$: The transition probability for this case
can be obtained by taking the complement of (47) and
(48), which is equivalent to the corresponding transition
probability given with (24) and (26).
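
The aggregation argument behind (47) and (48) can be checked numerically with a short Python sketch. It rests on two assumptions that are an interpretation of the proof rather than something stated here: the system distribution has the product form $\Pi(\mathbf{u}) = \prod_i \pi^i_{u_i}$, and in system state $\mathbf{u}$ transmitter $k$ transmits with probability $Q_k p^k_{\mathrm{tr}}(u_k)$. States are indexed by integers $b$ instead of energy values $bE_H^j$, and all numerical values and function names are hypothetical.

```python
from itertools import product
from math import isclose, prod

# Hypothetical inputs for I = 3 transmitters with 3 state indices each.
pi = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [0.1, 0.6, 0.3]]    # pi[j][b]
Q = [0.2, 0.3, 0.1]                                          # Q_j
p_tr = [[0.0, 0.4, 0.9], [0.0, 0.5, 1.0], [0.0, 0.2, 0.7]]   # p_tr[j][b]
I = len(pi)

def aggregated(i, u):
    """Aggregate system-level transitions over the subset with u_i = u.

    Returns the conditional probabilities of moving to state 0 and to v_1.
    """
    mass = to_zero = to_v1 = 0.0
    for state in product(*(range(len(m)) for m in pi)):
        if state[i] != u:
            continue
        weight = prod(pi[k][state[k]] for k in range(I))  # Pi(u), product form
        mass += weight
        to_zero += weight * Q[i] * p_tr[i][u]
        to_v1 += weight * sum(Q[j] * p_tr[j][state[j]]
                              for j in range(I) if j != i)
    return to_zero / mass, to_v1 / mass

def closed_form(i, u):
    """Right-hand sides of (47) and (48) for transmitter i in state index u."""
    p_zero = Q[i] * p_tr[i][u]
    p_v1 = sum(pi[j][b] * Q[j] * p_tr[j][b]
               for j in range(I) if j != i
               for b in range(len(pi[j])))
    return p_zero, p_v1

# Compare both computations for every transmitter and state index.
matches = all(
    isclose(a, r, abs_tol=1e-12)
    for i in range(I)
    for u in range(len(pi[i]))
    for a, r in zip(aggregated(i, u), closed_form(i, u))
)
```

In this sketch `matches` evaluates to `True`, and the probability of the third case follows as one minus the sum of the two returned values.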

Therefore, we obtain all possible transitions for transmitter $i$
at time $t$, for which the corresponding transition probabilities
can be computed as well. Thus, $\{\Pi^i\}$ and $\{P^i\}$ are obtained
from $\Pi$ and $P$, which proves the direction “⇒” of (44).

Overall, the convergence of Algorithm 1 is proved.


References 

[1] H. Li, C. Huang, S. Cui, and J. Zhang, “Distributed opportunistic 
scheduling for wireless networks powered by renewable energy sources,” 
in Proc. IEEE INFOCOM, Toronto, ON, Canada, 2014, pp. 898-906. 

[2] S. Sudevalayam and P. Kulkami, “Energy harvesting sensor nodes: survey 
and implications,” IEEE Commun. Surveys Tuts., vol. 13, no. 3, pp. 443- 
461, Third Quarter 2011. 

[3] B. Medepally, N. B. Mehta, and C. R. Murthy, “Implications of energy 
profile and storage on energy harvesting sensor link performance,” in 
Proc. IEEE GLOBECOM, Hawaii, HI, USA, 2009, pp. 1-6. 

[4] C. K. Ho and R. Zhang, “Optimal energy allocation for wireless com¬ 
munications with energy harvesting constraints,” IEEE Trans. Signal 
Processing, vol. 60, no. 9, pp. 4808-4818, Sept. 2012. 

[5] O. Ozel, K. Tutuncuoglu, J. Yang, S. Ulukus, and A. Yener, “Transmis¬ 
sion with energy harvesting nodes in fading wireless channels: optimal 
policies,” IEEE J. Sel. Areas Commun., vol. 29, no. 8, pp. 1732-1743, 
Sept. 2011. 

[6] C. Huang, R. Zhang, and S. Cui, “Throughput maximization for the 
Gaussian relay channel with energy harvesting constraints,” IEEE J. Sel. 
Areas Commun., vol. 31, no. 8, pp. 1469-1479, Aug. 2013. 

[7] S. Luo, R. Zhang, and T. J. Lim, “Optimal save-then-transmit protocol for 
energy harvesting wireless transmitters,” IEEE Trans. Wireless Commun., 
vol. 12, no. 3, pp. 1196-1207, Mar. 2013. 

[8] F. Iannello, O. Simeone, and U. Spagnolini, “Medium access control 
protocols for wireless sensor networks with energy harvesting,” IEEE 
Trans. Commun., vol. 60, no. 5, pp. 1381-1389, May 2012. 

[9] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, P. Whiting, and 
R. Vijayakumar, “Providing quality of service over a shared wireless link,” 
IEEE Commun. Mag., vol. 39, no. 2, pp. 150-154, Feb. 2001. 

[10] P. Viswanath, D. N. Tse, and R. Laroia, “Opportunistic beamforming 
using dumb antennas,” IEEE Trans. Inf. Theory, vol. 48, no. 6, pp. 1277- 
1294, June 2002. 

[11] X. Liu, E. K. P. Chong, and N. B. Shroff, “A framework for opportunistic 
scheduling in wireless networks,” Comput. Netw., vol. 41, no. 4, pp. 451 - 
474, Mar. 2003. 

[12] D. Zheng, W. Ge, and J. Zhang, “Distributed opportunistic scheduling 
for ad hoc networks with random access: An optimal stopping approach,” 
IEEE Trans. Inf. Theory, vol. 55, no. 1, pp. 205-222, Jan. 2009. 

[13] D. Zheng, “Physical-layer aware control and optimization in stochastic 
wireless networks,” Ph.D. thesis, Arizona State Univ., Aug. 2007. 

[14] T. S. Ferguson, Optimal stopping and applications, 2006 [Online]. 
Available: http://www.math.ucla.edu/~tom/Stopping/Contents.html 









14 


[15] C. Thejaswi P. S., J. Zhang, M.-O, Pun, and H. V. Poor, “Distributed 
opportunistic scheduling with two-Level probing,” IEEE/ACM Trans. 
Netw., vol. 18, no. 5, pp. 1464-1477, Oct. 2010. 

[16] W. Stadje, “An optimal stopping problem with two levels of incomplete 
information,” Math. Methods of Oper. Res., vol. 45, no. 1, pp. 119-131, 
Feb. 1997. 

[17] M. Beaudin, H. Zareipour, A. Schellenberglabe, and W. Rosehart, 
“Energy storage for mitigating the variability of renewable electricity 
sources: an updated review,” Energy for Sustainable Development , vol. 14, 
no. 4, pp. 302-314, Dec. 2010. 

[18] S. Combs, The energy report. Chapter 10, May, 2008 [Online], 
www. window, state. tx. us/specialrpt/energy. 

[19] N. Abramson, “The throughput of packet broadcasting channels,” IEEE 
Trans. Commun., vol. 25, no. 1, pp. 117-128, Jan. 1977. 

[20] P. Billingsley, Probability and Measure, 3rd ed., New York: John Wiley 
& Sons, Inc., 1995. 

[21] T. S. Ferguson and J. B. MacQueen, “Some time-invariant stopping rule 
problems,” Optimization, vol. 23, no. 2, pp. 155-169, Jan. 1992. 

[22] G. Peskir and A. Shiryaev, Optimal Stopping and Free-Boundary Prob¬ 
lems, Basel: Birkhauser Verlag, 2006. 

[23] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge, U.K.: 
Cambridge Univ. Press, 2004. 

[24] R. Gallager, Discrete Stochastic Processes, Boston: Kluwer Academic 
Publishers, 1996. 

[25] A. Seyedi and B. Sikdar, “Energy efficient transmission strategies for 
body sensor networks with energy harvesting,” IEEE Trans. Commun., 
vol. 58, no. 7, pp. 2116-2126, July 2010. 

[26] M. Kashef and A. Ephremides, “Optimal packet scheduling for energy 
harvesting sources on time varying wireless channels,” J. Commun. and 
Netw., vol. 14, no. 2, pp. 121-129, Apr. 2012. 


